JDK-8256541

Sort out what version of awk is used in the build system



    • Bug
    • Status: Resolved
    • P3
    • Resolution: Fixed
    • 16
    • 16
    • infrastructure
    • b27


      For historical reasons, there exists a variety of different implementations of awk: awk (the original implementation), gawk (the GNU version), nawk (new awk, iirc) and the lesser known mawk.

      Things are complicated by the fact that the original awk is seldom used, but instead gawk or nawk is typically symlinked to be named "awk".

      In terms of functionality there are very few differences. The original awk is the most limited, while nawk and gawk are mostly interchangeable.

      So the situation is somewhat messy to begin with, but we manage, impressively, to mess it up even further. :-)

      We set up the following definitions:
      BASIC_REQUIRE_PROGS(NAWK, [nawk gawk awk])
      and AC_PROG_AWK, which, according to the documentation, will "[c]heck for gawk, mawk, nawk, and awk, in that order".

      So, if you have nawk and awk (but no other) installed, both NAWK and AWK will be set to nawk. If you have only awk, both will be set to awk. The difference appears when gawk is installed alongside nawk: then NAWK will be nawk and AWK will be gawk.
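The three cases above can be reproduced with a small simulation. This is only a sketch: `first_match` and `resolve` are hypothetical helpers written for illustration, not part of the build system, and the set of installed programs is passed in as a plain string rather than probed from the machine.

```shell
#!/bin/sh
# first_match INSTALLED CANDIDATE...
# Prints the first candidate that appears in the space-separated INSTALLED list.
first_match() {
    installed=" $1 "
    shift
    for candidate in "$@"; do
        case "$installed" in
            *" $candidate "*) printf '%s\n' "$candidate"; return 0 ;;
        esac
    done
    return 1
}

# resolve INSTALLED
# Shows what the two lookups described above would select.
resolve() {
    nawk=$(first_match "$1" nawk gawk awk)      # order given to BASIC_REQUIRE_PROGS(NAWK, ...)
    awk=$(first_match "$1" gawk mawk nawk awk)  # order documented for AC_PROG_AWK
    printf 'NAWK=%s AWK=%s\n' "$nawk" "$awk"
}

resolve "nawk awk"       # prints NAWK=nawk AWK=nawk
resolve "awk"            # prints NAWK=awk AWK=awk
resolve "gawk nawk awk"  # prints NAWK=nawk AWK=gawk
```

The simulation only models the search order; it says nothing about which binary each name actually runs.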

      As an example, on my mac, I only have the original awk, so both AWK and NAWK will be awk.

      On my Ubuntu box, things are even more confusing. I have:
      $ ls -l /usr/bin/*awk
      lrwxrwxrwx 1 root root 21 Feb 6 10:36 awk -> /etc/alternatives/awk*
      -rwxr-xr-x 1 root root 658072 Feb 11 2018 gawk*
      -rwxr-xr-x 1 root root 3189 Feb 11 2018 igawk*
      -rwxr-xr-x 1 root root 125416 Apr 3 2018 mawk*
      lrwxrwxrwx 1 root root 22 Feb 6 10:37 nawk -> /etc/alternatives/nawk*

      $ ls -l /etc/alternatives/*awk
      lrwxrwxrwx 1 root root 13 Feb 10 10:56 /etc/alternatives/awk -> /usr/bin/gawk*
      lrwxrwxrwx 1 root root 13 Feb 10 10:56 /etc/alternatives/nawk -> /usr/bin/gawk*

      So awk, nawk and gawk all execute the same binary, i.e. gawk. Only mawk is different. So on that machine, AWK would be gawk and NAWK would be nawk, but both will execute gawk.
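The same check can be scripted instead of reading `ls -l` output by hand. A minimal sketch, assuming a `readlink` that supports `-f` (GNU coreutils does; other platforms may not); `real_binary` is a hypothetical helper name.

```shell
#!/bin/sh
# real_binary PATH
# Prints the file that PATH ultimately resolves to after following symlinks.
# Assumption: readlink supports -f on this system.
real_binary() {
    readlink -f "$1"
}

# Report, for each awk name, which file would actually be executed.
for name in awk gawk nawk mawk; do
    if path=$(command -v "$name" 2>/dev/null); then
        printf '%-5s -> %s\n' "$name" "$(real_binary "$path")"
    else
        printf '%-5s (not found)\n' "$name"
    fi
done
```

Names that print the same target are the same implementation under different labels.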

      I propose that we remove NAWK and only use AWK, but that we stop using AC_PROG_AWK and instead define AWK with a search order that is transparent to us. I recommend [gawk nawk awk], since on Linux systems nawk (as we've seen) is likely to be gawk in disguise anyway, so it's better to be clear about that.
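As an illustration of what the proposed search order means in practice, the following sketch probes PATH for gawk, nawk and awk in that order. `pick_awk` is a hypothetical function, not the actual configure logic; it only mirrors the recommended [gawk nawk awk] ordering.

```shell
#!/bin/sh
# pick_awk
# Prints the path of the first awk found, searching in the proposed
# order gawk, nawk, awk. Returns 1 if none is on PATH.
pick_awk() {
    for candidate in gawk nawk awk; do
        if found=$(command -v "$candidate" 2>/dev/null); then
            printf '%s\n' "$found"
            return 0
        fi
    done
    return 1
}

if AWK=$(pick_awk); then
    echo "AWK=$AWK"
else
    echo "no awk implementation found" >&2
fi
```

With this order, a machine that has gawk always reports gawk explicitly, rather than reaching it through a nawk symlink.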

      This reasoning assumes that the awk scripts we write are portable enough to be executed by any awk. If we run into any problems with this, we might have to restrict the set of awk implementations we support.
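One way to test the portability assumption is to run the same awk program under every installed implementation and compare the results. The sketch below is a hypothetical smoke test, not something the build currently does; `awks_agree` and the sample program are made up for illustration.

```shell
#!/bin/sh
# awks_agree PROGRAM INPUT
# Runs PROGRAM on INPUT with every awk found on PATH. Returns 0 if all
# outputs are identical (also when fewer than two awks are installed),
# and 1 at the first mismatch.
awks_agree() {
    program=$1
    input=$2
    reference=
    have_reference=no
    for impl in gawk nawk mawk awk; do
        command -v "$impl" >/dev/null 2>&1 || continue
        output=$(printf '%s\n' "$input" | "$impl" "$program")
        if [ "$have_reference" = no ]; then
            reference=$output
            have_reference=yes
        elif [ "$output" != "$reference" ]; then
            printf 'mismatch: %s printed "%s", expected "%s"\n' \
                "$impl" "$output" "$reference" >&2
            return 1
        fi
    done
    return 0
}

# Sample program: count the total number of fields in the input.
awks_agree '{ n += NF } END { print n }' 'a b c
d e' && echo "all installed awks agree"
```

A check like this only covers the constructs that the sample program exercises, so it would need to be run against the real build scripts to be meaningful.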


              ihse Magnus Ihse Bursie