Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8283758

Weaken the InflaterInputStream specification in order to allow faster Zip implementations

    XMLWordPrintable

Details

    • CSR
    • Resolution: Approved
    • P4
    • 20
    • core-libs
    • None
    • behavioral
    • minimal
    • Hide
      There is at least one large cloud company that has been using JDK builds with a "scribbling Zip implementation" in production systems for two years and have not observed any issues.

      Builds of the JDK use the zlib code in the JDK repo or the system zlib. The zlib code in the JDK does not do scribble. We are not aware of any system zlib that scribbles.

      The behavior change that is allowed by the spec changes proposed here is therefore limited to environments where an alternative zlib implementation is used.
      Show
      There is at least one large cloud company that has been using JDK builds with a "scribbling Zip implementation" in production systems for two years and have not observed any issues. Builds of the JDK use the zlib code in the JDK repo or the system zlib. The zlib code in the JDK does not do scribble. We are not aware of any system zlib that scribbles. The behavior change that is allowed by the spec changes proposed here is therefore limited to environments where an alternative zlib implementation is used.
    • Java API
    • SE

    Description

      Summary

      Weaken the specification of InflaterInputStream::read(byte[] b, int off, int len) (and the overridden methods in GZipInputStream, ZipInputStream and JarInputStream) to highlight that contrary to the specification of the InputStream superclass, an implementation is free to modify the elements beyond the last inflated byte in the output buffer b. Add an API note to classes declared to return an InputStream but actually return an instance of InflaterInputStream (or one of its subclasses) to let developers know that the returned input stream might scribble the output buffer.

      Problem

      The superclass java.io.InputStream specifies that read(byte[] b, int off, int len) will leave the content beyond the last read byte in the read buffer b unaffected. However, the overridden read method in InflaterInputStream (and GZipInputStream/ZipInputStream) passes the read buffer b to Inflater::inflate(byte[] b, int off, int len) which doesn't provide this guarantee. Depending on implementation details of the underlying native zlib implementation, Inflater::inflate might write more than the returned number of inflated bytes into the buffer b. Chromiums optimized zlib version is an implementation which is known to do this.

      Solution

      Weaken the specification of InflaterInputStream::read(byte[] b, int off, int len) (and the overridden methods in GZipInputStream, ZipInputStream and JarInputStream) to highlight that contrary to the specification of the InputStream superclass, an implementation is free to modify the elements beyond the last inflated byte in the output buffer b. Add an API note to classes declared to return an InputStream but actually return an instance of InflaterInputStream (or one of its subclasses) to let developers know that the returned input stream might scribble the output buffer.

      Specification

      @@ -124,13 +124,24 @@ public class InflaterInputStream extends FilterInputStream {
           }
      
           /**
      -     * Reads uncompressed data into an array of bytes. If {@code len} is not
      -     * zero, the method will block until some input can be decompressed; otherwise,
      -     * no bytes are read and {@code 0} is returned.
      +     * Reads uncompressed data into an array of bytes, returning the number of inflated
      +     * bytes. If {@code len} is not zero, the method will block until some input can be
      +     * decompressed; otherwise, no bytes are read and {@code 0} is returned.
      +     * <p>
      +     * If this method returns a nonzero integer <i>n</i> then {@code buf[off]}
      +     * through {@code buf[off+}<i>n</i>{@code -1]} contain the uncompressed
      +     * data.  The content of elements {@code buf[off+}<i>n</i>{@code ]} through
      +     * {@code buf[off+}<i>len</i>{@code -1]} is undefined, contrary to the
      +     * specification of the {@link java.io.InputStream InputStream} superclass,
      +     * so an implementation is free to modify these elements during the inflate
      +     * operation. If this method returns {@code -1} or throws an exception then
      +     * the content of {@code buf[off]} through {@code buf[off+}<i>len</i>{@code
      + * -1]} is undefined. + * * @param b the buffer into which the data is read * @param off the start offset in the destination array {@code b} * @param len the maximum number of bytes read - * @return the actual number of bytes read, or -1 if the end of the + * @return the actual number of bytes inflated, or -1 if the end of the * compressed input is reached or a preset dictionary is needed * @throws NullPointerException If {@code b} is {@code null}. * @throws IndexOutOfBoundsException If {@code off} is negative,
      @@ -92,13 +92,24 @@ public class GZIPInputStream extends InflaterInputStream {
           }
      
           /**
      -     * Reads uncompressed data into an array of bytes. If {@code len} is not
      -     * zero, the method will block until some input can be decompressed; otherwise,
      -     * no bytes are read and {@code 0} is returned.
      +     * Reads uncompressed data into an array of bytes, returning the number of inflated
      +     * bytes. If {@code len} is not zero, the method will block until some input can be
      +     * decompressed; otherwise, no bytes are read and {@code 0} is returned.
      +     * <p>
      +     * If this method returns a nonzero integer <i>n</i> then {@code buf[off]}
      +     * through {@code buf[off+}<i>n</i>{@code -1]} contain the uncompressed
      +     * data.  The content of elements {@code buf[off+}<i>n</i>{@code ]} through
      +     * {@code buf[off+}<i>len</i>{@code -1]} is undefined, contrary to the
      +     * specification of the {@link java.io.InputStream InputStream} superclass,
      +     * so an implementation is free to modify these elements during the inflate
      +     * operation. If this method returns {@code -1} or throws an exception then
      +     * the content of {@code buf[off]} through {@code buf[off+}<i>len</i>{@code
      + * -1]} is undefined. + * * @param buf the buffer into which the data is read - * @param off the start offset in the destination array {@code b} + * @param off the start offset in the destination array {@code buf} * @param len the maximum number of bytes read - * @return the actual number of bytes read, or -1 if the end of the + * @return the actual number of bytes inflated, or -1 if the end of the * compressed input stream is reached * * @throws NullPointerException If {@code buf} is {@code null}.
      @@ -165,10 +165,21 @@ public class ZipInputStream extends InflaterInputStream implements ZipConstants
           }
      
           /**
      -     * Reads from the current ZIP entry into an array of bytes.
      -     * If {@code len} is not zero, the method
      -     * blocks until some input is available; otherwise, no
      -     * bytes are read and {@code 0} is returned.
      +     * Reads from the current ZIP entry into an array of bytes, returning the number of
      +     * inflated bytes. If {@code len} is not zero, the method blocks until some input is
      +     * available; otherwise, no bytes are read and {@code 0} is returned.
      +     * <p>
      +     * If the current entry is compressed and this method returns a nonzero
      +     * integer <i>n</i> then {@code buf[off]}
      +     * through {@code buf[off+}<i>n</i>{@code -1]} contain the uncompressed
      +     * data.  The content of elements {@code buf[off+}<i>n</i>{@code ]} through
      +     * {@code buf[off+}<i>len</i>{@code -1]} is undefined, contrary to the
      +     * specification of the {@link java.io.InputStream InputStream} superclass,
      +     * so an implementation is free to modify these elements during the inflate
      +     * operation. If this method returns {@code -1} or throws an exception then
      +     * the content of {@code buf[off]} through {@code buf[off+}<i>len</i>{@code
      + * -1]} is undefined. + * * @param b the buffer into which the data is read * @param off the start offset in the destination array {@code b} * @param len the maximum number of bytes read
      @@ -167,10 +167,21 @@ public class JarInputStream extends ZipInputStream {
           }
      
           /**
      -     * Reads from the current JAR file entry into an array of bytes.
      -     * If {@code len} is not zero, the method
      -     * blocks until some input is available; otherwise, no
      -     * bytes are read and {@code 0} is returned.
      +     * Reads from the current JAR entry into an array of bytes, returning the number of
      +     * inflated bytes. If {@code len} is not zero, the method blocks until some input is
      +     * available; otherwise, no bytes are read and {@code 0} is returned.
      +     * <p>
      +     * If the current entry is compressed and this method returns a nonzero
      +     * integer <i>n</i> then {@code buf[off]}
      +     * through {@code buf[off+}<i>n</i>{@code -1]} contain the uncompressed
      +     * data.  The content of elements {@code buf[off+}<i>n</i>{@code ]} through
      +     * {@code buf[off+}<i>len</i>{@code -1]} is undefined, contrary to the
      +     * specification of the {@link java.io.InputStream InputStream} superclass,
      +     * so an implementation is free to modify these elements during the inflate
      +     * operation. If this method returns {@code -1} or throws an exception then
      +     * the content of {@code buf[off]} through {@code buf[off+}<i>len</i>{@code
      + * -1]} is undefined. + * <p> * If verification has been enabled, any invalid signature * on the current entry will be reported at some point before the * end of the entry is reached.
      @@ -840,6 +840,12 @@ public abstract class URLConnection {
            * returned input stream if the read timeout expires before data
            * is available for read.
            *
      +     * @apiNote The {@code InputStream} returned by this method can wrap an
      +     * {@link java.util.zip.InflaterInputStream InflaterInputStream}, whose
      +     * {@link java.util.zip.InflaterInputStream#read(byte[], int, int)
      +     * read(byte[], int, int)} method can modify any element of the output
      +     * buffer.
      +     *
            * @return     an input stream that reads from this open connection.
            * @throws     IOException              if an I/O error occurs while
            *               creating the input stream.
      @@ -343,6 +343,12 @@ public class ZipFile implements ZipConstants, Closeable {
            * Closing this ZIP file will, in turn, close all input streams that
            * have been returned by invocations of this method.
            *
      +     * @apiNote The {@code InputStream} returned by this method can wrap an
      +     * {@link java.util.zip.InflaterInputStream InflaterInputStream}, whose
      +     * {@link java.util.zip.InflaterInputStream#read(byte[], int, int)
      +     * read(byte[], int, int)} method can modify any element of the output
      +     * buffer.
      +     *
            * @param entry the zip file entry
            * @return the input stream for reading the contents of the specified
            * zip file entry or null if the zip file entry does not exist
      @@ -822,6 +822,13 @@ public class JarFile extends ZipFile {
           /**
            * Returns an input stream for reading the contents of the specified
            * zip file entry.
      +     *
      +     * @apiNote The {@code InputStream} returned by this method can wrap an
      +     * {@link java.util.zip.InflaterInputStream InflaterInputStream}, whose
      +     * {@link java.util.zip.InflaterInputStream#read(byte[], int, int)
      +     * read(byte[], int, int)} method can modify any element of the output
      +     * buffer.
      +     *
            * @param ze the zip file entry
            * @return an input stream for reading the contents of the specified
            *         zip file entry or null if the zip file entry does not exist

      Attachments

        Issue Links

          Activity

            People

              simonis Volker Simonis
              simonis Volker Simonis
              Alan Bateman, Lance Andersen
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: