JDK-8322256

Define and document GZIPInputStream concatenated stream semantics

    • Type: Bug
    • Resolution: Unresolved
    • Priority: P4
    • 24
    • 7
    • core-libs
    • None
    • In Review
    • generic
    • generic

      GZIPInputStream has supported reading multiple concatenated GZIP data streams since JDK-4691425. To do this, after reading the trailer of one stream it attempts to read the header of the next stream. If that succeeds, it continues decoding. If it fails, it silently ignores the remaining bytes as trailing garbage and reports end-of-data.
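      The behavior described above can be observed with a small program. This is a minimal sketch, not part of the original report: the class name GzipConcatDemo and its helper methods are made up for illustration, and only existing java.util.zip and java.io calls are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; the name and helpers are illustrative only.
final class GzipConcatDemo {

    /** Compresses one string into a complete GZIP stream (header, data, trailer). */
    static byte[] gzip(String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the bytes of a followed by the bytes of b. */
    static byte[] concat(byte[] a, byte[] b) {
        byte[] joined = new byte[a.length + b.length];
        System.arraycopy(a, 0, joined, 0, a.length);
        System.arraycopy(b, 0, joined, a.length, b.length);
        return joined;
    }

    /** Reads everything GZIPInputStream yields before it reports end-of-data. */
    static String gunzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two complete GZIP streams back to back.
        byte[] twoStreams = concat(gzip("hello"), gzip("world"));
        System.out.println(gunzip(twoStreams));

        // One GZIP stream followed by bytes that are not a GZIP header.
        byte[] withGarbage = concat(gzip("hello"), new byte[] {1, 2, 3, 4});
        System.out.println(gunzip(withGarbage));
    }
}
```

      With the behavior this issue describes, the first call returns the data of both streams joined together, and the second returns only the first stream's data without raising any exception for the extra bytes.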

      There are several issues with this:

      1. The behaviors of (a) supporting concatenated streams and (b) ignoring trailing garbage are not documented, much less precisely specified.

      2. Ignoring trailing garbage is dubious because it can easily hide errors or other data corruption that an application would rather be notified about. Moreover, the API documentation states that a ZipException is thrown when corrupt data is read, but that does not happen in the trailing-garbage scenario.

      3. There's no way to create a GZIPInputStream that does NOT support stream concatenation. For example, an application that sends multiple sequential compressed streams over a single underlying stream and wants to read them out one at a time would need the decoder to stop at each stream trailer.

      See this github comment for a history of this class: https://github.com/openjdk/jdk/pull/17113#issuecomment-1859177655

      Suggestion:

      - Add a new method setEnableConcatenatedStreams(boolean), defaulting to true
      - When concatenated streams are disabled, stop after reading a stream trailer
      - When concatenated streams are enabled, throw ZipException if any data follows a trailer but cannot be interpreted as the next stream's header

      From a backward-compatibility point of view, these changes would preserve the current behavior, except that trailing garbage would now generate a ZipException instead of being silently discarded. For full backward compatibility, there could be another knob, setIgnoreTrailingGarbage(boolean).
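      To make the suggested semantics concrete, here is a rough standalone sketch. It is not the JDK implementation and does not touch GZIPInputStream: the class StrictGzipSketch and its decode method are hypothetical, the allowConcatenation flag only mirrors the proposed setEnableConcatenatedStreams(boolean), the whole input is assumed to be in memory, and the optional header CRC is skipped rather than verified. The proposed setIgnoreTrailingGarbage(boolean) knob is not modeled.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.CRC32;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;
import java.util.zip.ZipException;

// Hypothetical sketch of the proposed semantics; not an existing JDK class.
final class StrictGzipSketch {

    /**
     * Decodes in-memory GZIP data. With allowConcatenation == false, decoding
     * stops right after the first trailer. With true, any bytes after a trailer
     * must form another valid stream, otherwise a ZipException is thrown.
     */
    static byte[] decode(byte[] data, boolean allowConcatenation) throws ZipException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        try {
            do {
                pos = decodeMember(data, pos, out);
            } while (allowConcatenation && pos < data.length);
        } catch (IndexOutOfBoundsException e) {
            throw new ZipException("Truncated GZIP header");
        }
        return out.toByteArray();
    }

    /** Decodes one stream starting at pos; returns the offset just past its trailer. */
    private static int decodeMember(byte[] d, int pos, ByteArrayOutputStream out)
            throws ZipException {
        // Fixed header: magic 0x1f 0x8b, method 8 (deflate), flags, then 6 more bytes.
        if (d.length - pos < 10 || (d[pos] & 0xff) != 0x1f
                || (d[pos + 1] & 0xff) != 0x8b || d[pos + 2] != 8) {
            throw new ZipException("Not a GZIP header at offset " + pos);
        }
        int flags = d[pos + 3] & 0xff;
        pos += 10;
        if ((flags & 4) != 0) pos += 2 + ((d[pos] & 0xff) | (d[pos + 1] & 0xff) << 8); // FEXTRA
        if ((flags & 8) != 0) while (d[pos++] != 0) { }   // FNAME, zero-terminated
        if ((flags & 16) != 0) while (d[pos++] != 0) { }  // FCOMMENT, zero-terminated
        if ((flags & 2) != 0) pos += 2;                   // FHCRC, skipped without verification
        if (pos > d.length) throw new ZipException("Truncated GZIP header");

        Inflater inf = new Inflater(true); // true = raw deflate data, no zlib wrapper
        CRC32 crc = new CRC32();
        long size;
        try {
            inf.setInput(d, pos, d.length - pos);
            byte[] buf = new byte[4096];
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && (inf.needsInput() || inf.needsDictionary())) {
                    throw new ZipException("Truncated or invalid deflate data");
                }
                crc.update(buf, 0, n);
                out.write(buf, 0, n);
            }
            pos = d.length - inf.getRemaining(); // first byte after the deflate data
            size = inf.getBytesWritten();
        } catch (DataFormatException e) {
            throw new ZipException("Invalid deflate data: " + e.getMessage());
        } finally {
            inf.end();
        }

        // Trailer: CRC-32 and uncompressed size modulo 2^32, both little-endian.
        if (d.length - pos < 8) throw new ZipException("Truncated GZIP trailer");
        if (readUInt32(d, pos) != crc.getValue()
                || readUInt32(d, pos + 4) != (size & 0xffffffffL)) {
            throw new ZipException("Corrupt GZIP trailer");
        }
        return pos + 8;
    }

    private static long readUInt32(byte[] d, int p) {
        return (d[p] & 0xffL) | (d[p + 1] & 0xffL) << 8
                | (d[p + 2] & 0xffL) << 16 | (d[p + 3] & 0xffL) << 24;
    }
}
```

      In this sketch, decode(data, false) returns only the first stream's contents and leaves any following bytes uninterpreted, while decode(data, true) either decodes every stream in sequence or fails with a ZipException at the first byte sequence that is not a valid header.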

            eirbjo Eirik Bjørsnøs
            acobbs Archie Cobbs