Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8281962

Avoid unnecessary native calls in InflaterInputStream

    XMLWordPrintable

Details

    • Enhancement
    • Resolution: Fixed
    • P4
    • 19
    • 8, 11, 17, 18
    • core-libs
    • None

    Description

      Currently, `InflaterInputStream::read()` first does a native call to the underlying zlib `inflate()` and only afterwards checks if the inflater requires input (i.e. `Inflater::needsInput()`) or has finished inflating (`Inflater::finished()`). This leads to an unnecessary native call to `inflate()` when `InflaterInputStream::read()` is invoked for the very first time because `Inflater::fill()` hasn't been called and another unnecessary native call to detect the end of the stream. For small streams/files which completely fit into the output buffer passed to `InflaterInputStream::read()` we can therefore save two out of three native calls. The micro benchmark in the PR shows that this results in a 5%-10% performance improvement for zip files of sizes between 4096 to 256 bytes (notice that in common jars like Tomcat, Spring-Boot, Maven, Jackson, etc. about 60-80% of the classes are smaller than 4096 bytes).

      ```
      original
      --------
      Benchmark (size) Mode Cnt Score Error Units
      InflaterInputStreams.inflaterInputStreamRead 256 avgt 5 2.571 ± 0.120 us/op
      InflaterInputStreams.inflaterInputStreamRead 512 avgt 5 2.861 ± 0.064 us/op
      InflaterInputStreams.inflaterInputStreamRead 4096 avgt 5 5.110 ± 0.278 us/op

      patched
      -------
      Benchmark (size) Mode Cnt Score Error Units
      InflaterInputStreams.inflaterInputStreamRead 256 avgt 5 2.332 ± 0.081 us/op
      InflaterInputStreams.inflaterInputStreamRead 512 avgt 5 2.691 ± 0.293 us/op
      InflaterInputStreams.inflaterInputStreamRead 4096 avgt 5 4.812 ± 1.038 us/op
      ```

      As a side effect, this change also fixes an issue with alternative zlib implementations like zlib-Chromium or zlib-Cloudflare which pad the inflated bytes with a specific byte pattern at the end of the output for debugging purpose. This breaks code patterns like the following:

      ```java
      int readcount = 0;
      while ((bytesRead = inflaterInputStream.read(data, 0, bufferSize)) != -1) {
          outputStream.write(data, 0, bytesRead);
          readCount++;
      }
      if (readCount == 1) {
          return data; // <---- first bytes might be overwritten
      }
      return outputStream.toByteArray();
      ```

      Even if the whole data fits into the `data` array during the first call to `inflaterInputStream.read()`, we still need a second call to `inflaterInputStream.read()` in order to detect the end of the inflater stream. If this second call calls into the native `inflate()` function of Cloudflare/Chromium's zlib version, this will still write some padding bytes at the beginning of the `data` array and overwrite the previously read data. This issue has been reported in Spring [1] and ASM [2] when using these libraries with Cloudflares zlib version (by setting `LD_LIBRARY_PATH` accordingly).

      [1] https://github.com/spring-projects/spring-framework/issues/27429
      [2] https://gitlab.ow2.org/asm/asm/-/issues/317955

      Attachments

        Issue Links

          Activity

            People

              simonis Volker Simonis
              simonis Volker Simonis
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: