JDK-8292892

Javadoc index descriptions are not deterministic

    • Type: Bug
    • Resolution: Fixed
    • Priority: P4
    • Fix Version: 20
    • Affects Version: 15, 16, 17, 18, 19, 20
    • Component: tools
    • Introduced In Version: 15
    • Resolved In Build: b16

      The descriptions of index entries in the JDK API Specification are not deterministic. Their differences reveal an underlying bug unrelated to the goal of reproducible builds.

      There are 55,782 index entries in the JDK 20 API Specification, but 56 of them get different descriptions depending on whether I run the build on my local workstation, on a remote Launchpad build machine, or in a QEMU/KVM virtual machine. Of those 56 index entries, 40 are static variables of classes in the package 'java.util.jar'.

      For example, each build defines a single entry for LOCCRC in the L-Index file and in the member search index, but the descriptions for the entry differ as shown below.

      Local Workstation

        LOCCRC - Static variable in class java.util.jar.JarEntry
        Search: java.util.zip.JarEntry.LOCCRC -> File not found

      Remote Launchpad

        LOCCRC - Static variable in class java.util.jar.JarOutputStream
        Search: java.util.zip.JarOutputStream.LOCCRC -> File not found

      Virtual Machine

        LOCCRC - Static variable in class java.util.jar.JarInputStream
        Search: java.util.zip.JarInputStream.LOCCRC -> File not found

      When I list the source files of the package in directory order (unsorted) on my local workstation, JarEntry is the first class found that inherits the LOCCRC variable:

        $ ls -1U ~/opt/jdk-20/src/java.base/java/util/jar/
        JarEntry.java
        package-info.java
        JarOutputStream.java
        Attributes.java
        JarInputStream.java
        JarException.java
        JarFile.java
        JarVerifier.java
        JavaUtilJarAccessImpl.java
        Manifest.java

      On the virtual machine, JarInputStream is the first such class found:

        $ ls -1U ~/opt/jdk-20/src/java.base/java/util/jar/
        JarVerifier.java
        JarInputStream.java
        JavaUtilJarAccessImpl.java
        Manifest.java
        JarOutputStream.java
        Attributes.java
        JarFile.java
        JarException.java
        package-info.java
        JarEntry.java
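
      The difference between directory order and sorted order can be sketched in Java as follows. This is only an illustration, not the code Javadoc uses to find source files; the class and method names are hypothetical. It relies on the documented behavior of DirectoryStream, whose iterator returns entries in no specific order.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not part of the JDK or of Javadoc.
class DirectoryOrder {

    // Returns the '*.java' file names in whatever order the file system
    // reports them, comparable to 'ls -1U'.
    static List<String> unsortedNames(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.java")) {
            for (Path path : stream) {
                names.add(path.getFileName().toString());
            }
        }
        return names;
    }

    // Returns the same names sorted, so the result no longer depends on
    // the order in which the directory entries were created.
    static List<String> sortedNames(Path dir) throws IOException {
        List<String> names = unsortedNames(dir);
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Path.of(args.length > 0 ? args[0] : ".");
        System.out.println("Directory order: " + unsortedNames(dir));
        System.out.println("Sorted order:    " + sortedNames(dir));
    }
}
```

      Sorting makes the listing stable across machines, but as explained below, it would only hide the underlying problem with the index entries.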

      At first, this issue appeared to be the usual file-ordering problem of reproducible builds. Yet the LOCCRC variable is inherited by four of the classes in the 'java.util.jar' package: JarEntry, JarFile, JarInputStream, and JarOutputStream. The variable is also inherited by four classes in the 'java.util.zip' package: ZipEntry, ZipFile, ZipInputStream, and ZipOutputStream.

      The underlying issue is that the LOCCRC index entry, and others like it, should be listed once for each inheriting class. For LOCCRC, that would require eight entries. The problem seems to occur when documenting members inherited from classes with package access. Such members are to be documented as though they were declared in the inheriting class. See JDK-4780441 for details.
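
      The two behaviors can be contrasted with a small Java model. It is a sketch under stated assumptions, not the Javadoc implementation and not the proposed fix: the class IndexSketch and its methods are hypothetical, and the class orders in main are the orders of the inheriting classes in the two directory listings above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of an index; these names are not taken from Javadoc.
class IndexSketch {

    static final String PREFIX = "Static variable in class java.util.jar.";

    // Order-dependent: each member keeps the description of the first
    // inheriting class encountered, so directory order decides the result.
    static Map<String, String> firstFound(List<String> classes, List<String> members) {
        Map<String, String> index = new TreeMap<>();
        for (String cls : classes) {
            for (String member : members) {
                index.putIfAbsent(member, PREFIX + cls);
            }
        }
        return index;
    }

    // Order-independent: one entry per member and inheriting class, sorted.
    static List<String> onePerClass(List<String> classes, List<String> members) {
        List<String> entries = new ArrayList<>();
        for (String cls : classes) {
            for (String member : members) {
                entries.add(member + " - " + PREFIX + cls);
            }
        }
        Collections.sort(entries);
        return entries;
    }

    public static void main(String[] args) {
        // Inheriting classes in the order of the two 'ls -1U' listings.
        List<String> local = List.of("JarEntry", "JarOutputStream", "JarInputStream", "JarFile");
        List<String> vm = List.of("JarInputStream", "JarOutputStream", "JarFile", "JarEntry");
        List<String> members = List.of("LOCCRC");

        System.out.println(firstFound(local, members)); // described by JarEntry
        System.out.println(firstFound(vm, members));    // described by JarInputStream
        // Same four entries regardless of the input order:
        System.out.println(onePerClass(local, members).equals(onePerClass(vm, members)));
    }
}
```

      In this model, the first method reproduces the symptom (a single entry whose description depends on input order), while the second yields the same four 'java.util.jar' entries for LOCCRC on every system.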

      SYSTEM / OS / JAVA RUNTIME INFORMATION

      System information for my local workstation running Ubuntu 20.04.4 LTS is listed below:

        $ uname -a
        Linux tower 5.15.0-46-generic #49~20.04.1-Ubuntu SMP
          Thu Aug 4 19:15:44 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

        $ ldd --version
        ldd (Ubuntu GLIBC 2.31-0ubuntu9.9) 2.31

        $ getconf GNU_LIBPTHREAD_VERSION
        NPTL 2.31

        $ $HOME/opt/jdk-20/bin/java --version
        openjdk 20-ea 2023-03-21
        OpenJDK Runtime Environment (build 20-ea+11-661)
        OpenJDK 64-Bit Server VM (build 20-ea+11-661, mixed mode, sharing)

      STEPS TO REPRODUCE

      I was able to reproduce the problem by building on three different systems with the following processors:

        * Local Workstation: 4-core Intel Xeon CPU E3-1225 v5
        * Remote Launchpad: 4-core AMD EPYC-Rome Processor
        * Virtual Machine: Single-core Intel Core Processor (Skylake, IBRS)

      I can't be certain that the processor played a role in the ordering of files in their directories, but it may have affected the timing of the process that created them.

      EXPECTED RESULTS

      The builds of the JDK are identical.

      ACTUAL RESULT

      The builds are different, but they differ only in their Javadoc API index files and the corresponding 'member-search-index.js' file. There are 56 entries in the index that differ:

        $ git diff --numstat --shortstat local remote
        11 11 {local => remote}/index-12.txt
        1 1 {local => remote}/index-13.txt
        5 5 {local => remote}/index-18.txt
        1 1 {local => remote}/index-19.txt
        2 2 {local => remote}/index-20.txt
        1 1 {local => remote}/index-21.txt
        22 22 {local => remote}/index-3.txt
        12 12 {local => remote}/index-5.txt
        1 1 {local => remote}/index-8.txt
         9 files changed, 56 insertions(+), 56 deletions(-)
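
      As a cross-check, the per-file counts above can be summed to confirm the total of 56. The following Java sketch does so; the class NumstatSum is hypothetical, and it assumes text-only numstat lines (binary files, which 'git diff --numstat' reports with '-', are not handled).

```java
// Hypothetical helper for checking the totals; not part of any tool.
class NumstatSum {

    // Sums the first two columns (insertions, deletions) of numstat lines.
    // Lines with fewer than three columns are skipped.
    static int[] totals(String numstat) {
        int inserted = 0;
        int deleted = 0;
        for (String line : numstat.split("\n")) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 3) {
                continue;
            }
            inserted += Integer.parseInt(cols[0]);
            deleted += Integer.parseInt(cols[1]);
        }
        return new int[] {inserted, deleted};
    }

    public static void main(String[] args) {
        // The numstat lines from the report, without the shortstat summary.
        String numstat = String.join("\n",
            "11 11 {local => remote}/index-12.txt",
            "1 1 {local => remote}/index-13.txt",
            "5 5 {local => remote}/index-18.txt",
            "1 1 {local => remote}/index-19.txt",
            "2 2 {local => remote}/index-20.txt",
            "1 1 {local => remote}/index-21.txt",
            "22 22 {local => remote}/index-3.txt",
            "12 12 {local => remote}/index-5.txt",
            "1 1 {local => remote}/index-8.txt");
        int[] t = totals(numstat);
        // Prints "56 insertions, 56 deletions"
        System.out.println(t[0] + " insertions, " + t[1] + " deletions");
    }
}
```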

      The differences occur for index entries that are identified on my local workstation as:

        * Methods in class java.awt.BufferCapabilities.FlipContents
        * Methods in class java.time.chrono.HijrahDate
        * Methods in class jdk.incubator.vector.ByteVector
        * Static variables in class java.util.jar.JarEntry

      They are:

        java.awt.BufferCapabilities.FlipContents.hashCode()
        java.awt.BufferCapabilities.FlipContents.toString()
        java.time.chrono.HijrahDate.toString()
        java.time.chrono.HijrahDate.until(Temporal, TemporalUnit)
        java.util.jar.JarEntry.CENATT
        java.util.jar.JarEntry.CENATX
        java.util.jar.JarEntry.CENCOM
        java.util.jar.JarEntry.CENCRC
        java.util.jar.JarEntry.CENDSK
        java.util.jar.JarEntry.CENEXT
        java.util.jar.JarEntry.CENFLG
        java.util.jar.JarEntry.CENHDR
        java.util.jar.JarEntry.CENHOW
        java.util.jar.JarEntry.CENLEN
        java.util.jar.JarEntry.CENNAM
        java.util.jar.JarEntry.CENOFF
        java.util.jar.JarEntry.CENSIG
        java.util.jar.JarEntry.CENSIZ
        java.util.jar.JarEntry.CENTIM
        java.util.jar.JarEntry.CENVEM
        java.util.jar.JarEntry.CENVER
        java.util.jar.JarEntry.ENDCOM
        java.util.jar.JarEntry.ENDHDR
        java.util.jar.JarEntry.ENDOFF
        java.util.jar.JarEntry.ENDSIG
        java.util.jar.JarEntry.ENDSIZ
        java.util.jar.JarEntry.ENDSUB
        java.util.jar.JarEntry.ENDTOT
        java.util.jar.JarEntry.EXTCRC
        java.util.jar.JarEntry.EXTHDR
        java.util.jar.JarEntry.EXTLEN
        java.util.jar.JarEntry.EXTSIG
        java.util.jar.JarEntry.EXTSIZ
        java.util.jar.JarEntry.LOCCRC
        java.util.jar.JarEntry.LOCEXT
        java.util.jar.JarEntry.LOCFLG
        java.util.jar.JarEntry.LOCHDR
        java.util.jar.JarEntry.LOCHOW
        java.util.jar.JarEntry.LOCLEN
        java.util.jar.JarEntry.LOCNAM
        java.util.jar.JarEntry.LOCSIG
        java.util.jar.JarEntry.LOCSIZ
        java.util.jar.JarEntry.LOCTIM
        java.util.jar.JarEntry.LOCVER
        jdk.incubator.vector.ByteVector.castShape(VectorSpecies<F>, int)
        jdk.incubator.vector.ByteVector.check(Class<F>)
        jdk.incubator.vector.ByteVector.check(VectorSpecies<F>)
        jdk.incubator.vector.ByteVector.convertShape(
            VectorOperators.Conversion<Byte, F>, VectorSpecies<F>, int)
        jdk.incubator.vector.ByteVector.convert(
            VectorOperators.Conversion<Byte, F>, int)
        jdk.incubator.vector.ByteVector.maskAll(boolean)
        jdk.incubator.vector.ByteVector.reinterpretAsDoubles()
        jdk.incubator.vector.ByteVector.reinterpretAsFloats()
        jdk.incubator.vector.ByteVector.reinterpretAsInts()
        jdk.incubator.vector.ByteVector.reinterpretAsLongs()
        jdk.incubator.vector.ByteVector.reinterpretAsShorts()
        jdk.incubator.vector.ByteVector.species()

      I attached the following two files that show all of the differences:

        * index-local-vs-remote.diff - Compares 'api/index-files/*.html'
        * search-local-vs-remote.diff - Compares 'api/member-search-index.js'

      I made the comparisons easier by converting the HTML files to plain text with 'w3m' and by expanding the JavaScript file using the 'js-beautify' tool, as in the examples below:

        $ w3m -dump -cols 10000 -T text/html index-12.html > index-12.txt
        $ js-beautify --end-with-newline -o search-local.js \
            member-search-index.js

      SOURCE CODE FOR AN EXECUTABLE TEST CASE

      I used the following shell script to narrow the scope of packages while testing:

        #!/bin/bash
        # Runs Javadoc for testing

        # The JDK home directory and its extracted source files
        jdk_dir="$HOME/opt/jdk-20"
        jdk_src="$jdk_dir/src"

        "$jdk_dir/bin/java" --patch-module jdk.javadoc=target/classes \
            jdk.javadoc.internal.tool.Main \
            --source-path "$jdk_src/java.base" \
            -d tmp/doc -notimestamp -Xdoclint:none \
            java.util.jar "$@"

      WORKAROUND

      I don't have a workaround, but I do have a fix. My pull request will follow shortly.

        1. index-local-vs-remote.diff (8 kB, John Neffenger)
        2. search-local-vs-remote.diff (5 kB, John Neffenger)

            Assignee: jgneff John Neffenger
            Reporter: jgneff John Neffenger