Type: Bug
Resolution: Unresolved
Priority: P4
Fix Version/s: tbd
Affects Version/s: 7
Component/s: core-libs
Labels: None
Subcomponent: java.util.jar
Understanding: In Review
CPU: generic
OS: generic

GZIPInputStream has supported reading data from multiple concatenated GZIP data streams since JDK-4691425. To do this, after reading the trailer of a stream, it attempts to read the header of the next stream. If that succeeds, it proceeds onward; if it fails, it silently ignores the trailing garbage and returns end-of-data.
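A minimal sketch that exercises both behaviors with in-memory data. The class name GzipConcatDemo and its helper methods are made up for illustration; only GZIPInputStream, GZIPOutputStream and the other java.* classes are real API. On JDK builds that have the behavior described above, the first call is expected to return the two payloads joined together, and the second to return the payload with the trailing bytes dropped and no exception.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the JDK.
class GzipConcatDemo {

    // Compress a string into one complete GZIP stream (header, data, trailer).
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Concatenate two byte arrays.
    static byte[] join(byte[] a, byte[] b) {
        byte[] result = new byte[a.length + b.length];
        System.arraycopy(a, 0, result, 0, a.length);
        System.arraycopy(b, 0, result, a.length, b.length);
        return result;
    }

    // Read everything GZIPInputStream is willing to return for the input.
    static String gunzip(byte[] data) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(gz.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Two complete GZIP streams back to back: read as a single stream.
        byte[] twoStreams = join(gzip("hello "), gzip("world"));
        System.out.println(gunzip(twoStreams));

        // One GZIP stream followed by bytes that are not a GZIP header.
        byte[] withGarbage = join(gzip("hello"), "garbage".getBytes(StandardCharsets.UTF_8));
        System.out.println(gunzip(withGarbage));
    }
}
```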

There are several issues with this:

1. The behaviors of (a) supporting concatenated streams and (b) ignoring trailing garbage are not documented, much less precisely specified.

2. Ignoring trailing garbage is dubious because it could easily hide errors or other data corruption that an application would rather be notified about. Moreover, the API claims that a ZipException will be thrown when corrupt data is read, but obviously that doesn't happen in the trailing garbage scenario.

3. There's no way to create a GZIPInputStream that does NOT support stream concatenation. For example, an application that wanted to send multiple sequential compressed streams over a single underlying stream and read them out one at a time might want to operate in this mode.

See this GitHub comment for a history of this class: https://github.com/openjdk/jdk/pull/17113#issuecomment-1859177655

Suggestion:

- Add new method setEnableConcatenatedStreams(boolean), default true
- When concatenated streams disabled, stop after reading a stream trailer
- When concatenated streams enabled, throw ZipException if there is any data after a trailer but it cannot be successfully interpreted as a next header

From a backward-compatibility point of view, these changes would preserve the current behavior, except that trailing garbage would now generate a ZipException instead of being discarded. For full backward compatibility, there could be another knob, setIgnoreTrailingGarbage(boolean).
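The suggested methods do not exist in the JDK, so the stricter semantics can only be approximated outside GZIPInputStream. The following is a rough sketch under that assumption: a hypothetical StrictGzipSketch class that decodes an in-memory buffer one GZIP stream at a time using Inflater directly, stops at each trailer, and throws ZipException if the bytes after a trailer are not a valid header. It handles only headers without optional fields (FLG = 0), which is what GZIPOutputStream writes, and it is not the proposed implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;
import java.util.zip.DataFormatException;
import java.util.zip.GZIPOutputStream;
import java.util.zip.Inflater;
import java.util.zip.ZipException;

// Hypothetical sketch class, not part of the JDK and not the proposed API.
class StrictGzipSketch {

    // Little-endian unsigned 32-bit read, as used by the GZIP trailer (RFC 1952).
    static long u32(byte[] b, int p) {
        return (b[p] & 0xffL) | (b[p + 1] & 0xffL) << 8
             | (b[p + 2] & 0xffL) << 16 | (b[p + 3] & 0xffL) << 24;
    }

    // Decode every GZIP stream in buf separately; reject anything else.
    static List<String> readMembers(byte[] buf) throws IOException {
        List<String> members = new ArrayList<>();
        int pos = 0;
        while (pos < buf.length) {
            // Need at least a 10-byte header and an 8-byte trailer, with magic 1f 8b and CM = 8.
            if (buf.length - pos < 18 || (buf[pos] & 0xff) != 0x1f
                    || (buf[pos + 1] & 0xff) != 0x8b || buf[pos + 2] != 8) {
                throw new ZipException("Data at offset " + pos + " is not a GZIP header");
            }
            if (buf[pos + 3] != 0) {
                throw new ZipException("Optional header fields are not handled in this sketch");
            }
            pos += 10;

            Inflater inf = new Inflater(true); // raw deflate, no zlib wrapper
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try {
                inf.setInput(buf, pos, buf.length - pos);
                byte[] chunk = new byte[512];
                while (!inf.finished()) {
                    int n = inf.inflate(chunk);
                    if (n == 0 && !inf.finished()) {
                        throw new ZipException("Truncated or unsupported deflate data");
                    }
                    out.write(chunk, 0, n);
                }
                pos += (int) inf.getBytesRead(); // bytes consumed by this stream only
            } catch (DataFormatException e) {
                throw new ZipException("Corrupt deflate data: " + e.getMessage());
            } finally {
                inf.end();
            }

            // Trailer: CRC-32 of the uncompressed data, then its length modulo 2^32.
            byte[] data = out.toByteArray();
            CRC32 crc = new CRC32();
            crc.update(data);
            if (buf.length - pos < 8 || u32(buf, pos) != crc.getValue()
                    || u32(buf, pos + 4) != (data.length & 0xffffffffL)) {
                throw new ZipException("Corrupt GZIP trailer");
            }
            pos += 8;
            members.add(new String(data, StandardCharsets.UTF_8));
        }
        return members;
    }

    // Compress a string into one complete GZIP stream.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        wire.write(gzip("one"));
        wire.write(gzip("two"));
        System.out.println(readMembers(wire.toByteArray()));

        wire.write("garbage".getBytes(StandardCharsets.UTF_8));
        try {
            readMembers(wire.toByteArray());
        } catch (ZipException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping each stream as a separate list element corresponds to the "concatenated streams disabled" mode, where the reader stops at a trailer, while the header check at the top of the loop corresponds to the stricter handling of trailing data.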

csr for
JDK-8330195 Define and document GZIPInputStream concatenated stream semantics (Draft)

relates to
JDK-4691425 GZIPInputStream fails to read concatenated .gz files (Closed)
JDK-7036144 GZIPInputStream readTrailer uses faulty available() test for end-of-stream (Closed)

links to
Review(master) openjdk/jdk/18385
Review(master) openjdk/jdk/20787

Assignee: Eirik Bjørsnøs
Reporter: Archie Cobbs
Votes: 0
Watchers: 3

Created: 2023-12-17 07:59
Updated: 2024-11-25 10:52
