Type: CSR
Resolution: Approved
Priority: P4
Fix Version/s: 23
Component/s: core-libs
Labels:
None

Subcomponent:
java.util.jar
Compatibility Kind:

behavioral
Compatibility Risk:
low
Compatibility Risk Description:

This change makes GZIPInputStream more likely than before to attempt to read a concatenated stream. In theory, an implementation of `java.io.InputStream` could return 0 from available() specifically to prevent GZIPInputStream from reading a concatenated GZIP stream. Such implementations are hard to imagine, and they would be relying on an unspecified internal implementation detail.

Interface Kind:

Java API
Scope:
Implementation

Summary

Update java.util.zip.GZIPInputStream so that it no longer relies on the java.io.InputStream.available() method to decide whether to read a concatenated GZIP stream from the underlying input stream.

Problem

The GZIPInputStream class takes an InputStream from which it reads compressed GZIP data. The GZIP format allows multiple GZIP streams to be concatenated. An undocumented feature of the GZIPInputStream implementation is that it supports reading such concatenated GZIP streams. This is possible because the GZIP format defines an 8-byte trailer that marks the end of an individual GZIP stream.
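To make the concatenated case concrete, the following is a minimal sketch that writes two complete GZIP streams back to back and reads them through a single GZIPInputStream. The class and method names (`ConcatenatedGzipDemo`, `gzip`, `concatenated`, `gunzip`) are made up for this illustration. A ByteArrayInputStream is used as the underlying stream, so the whole input is handed over at once.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ConcatenatedGzipDemo {

    // Compress a string into one complete GZIP stream (header, deflate data, 8-byte trailer).
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Append two independent GZIP streams, one directly after the other.
    static byte[] concatenated(String first, String second) throws IOException {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        joined.write(gzip(first));
        joined.write(gzip(second));
        return joined.toByteArray();
    }

    // Decompress everything GZIPInputStream is willing to return.
    static String gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Both members are decompressed and returned as one continuous stream.
        System.out.println(gunzip(concatenated("hello, ", "world"))); // prints "hello, world"
    }
}
```

The caller never sees the boundary between the two streams: the trailer of the first and the header of the second are consumed internally.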

GZIPInputStream has a public read(byte[] buf, int off, int len) method, which returns uncompressed data read from the underlying, possibly concatenated, GZIP streams. After reading an 8-byte trailer from the underlying stream, the current implementation of this method calls java.io.InputStream.available() on the underlying stream to decide whether data for a subsequent concatenated GZIP stream follows. If available() returns 0, GZIPInputStream.read() does not read any additional data and marks the GZIPInputStream as having reached the end of the compressed input stream. Any subsequent call to read() then returns -1, indicating end of stream.

Relying on the return value of InputStream.available() is not appropriate, because the API javadoc of that method states that the return value is merely an estimate of the number of bytes available. The javadoc further states:

Note that while some implementations of {@code InputStream} will return the total number of bytes in the stream, many will not.

As a result, the current implementation of GZIPInputStream.read(), which relies on the underlying InputStream's available() method, can incorrectly conclude that the end of the GZIP stream has been reached even when a concatenated GZIP stream follows. GZIPInputStream.read() then ignores, and never returns, the uncompressed data of any subsequent GZIP streams.
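The situation described above can be explored with a small experiment. The sketch below wraps the compressed bytes in a hypothetical `TricklingInputStream` (a name invented here) that always reports available() as 0 and hands out at most one byte per bulk read, both of which the InputStream contract permits. How much data comes back depends on the JDK in use: on a release that still consults available(), the output may stop after the first stream, while a release containing this change would be expected to return the data of both. No specific output is asserted here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ZeroAvailableDemo {

    // Hypothetical stream for illustration: never claims to have bytes available
    // and delivers at most one byte per bulk read.
    static final class TricklingInputStream extends FilterInputStream {
        TricklingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, 1));
        }

        @Override
        public int available() {
            return 0; // a legal estimate, even when more data follows
        }
    }

    // Compress a string into one complete GZIP stream.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Two GZIP streams written back to back.
    static byte[] twoMembers(String first, String second) throws IOException {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        joined.write(gzip(first));
        joined.write(gzip(second));
        return joined.toByteArray();
    }

    // Decompress through the trickling wrapper and return whatever is produced.
    static String gunzipTrickled(byte[] compressed) throws IOException {
        InputStream source = new TricklingInputStream(new ByteArrayInputStream(compressed));
        try (GZIPInputStream in = new GZIPInputStream(source)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // The result depends on whether the running JDK consults available().
        System.out.println(gunzipTrickled(twoMembers("hello, ", "world")));
    }
}
```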

Solution

GZIPInputStream.read() will be updated to remove the check on InputStream.available(). After reading an 8-byte GZIP stream trailer, the implementation will now attempt to read a GZIP stream header from the underlying input stream. If the additional read() calls on the underlying input stream return enough bytes, and those bytes represent a GZIP stream header, then GZIPInputStream.read() will treat what follows as a concatenated GZIP stream and will continue to return uncompressed data from it. If, however, the read() calls on the underlying input stream do not return enough bytes, or the returned bytes do not represent a GZIP stream header, then the GZIPInputStream will be marked as having reached the end of the compressed input stream.
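The idea of probing for a header instead of asking available() can be sketched as follows. This is a simplified illustration, not the JDK's code: `GzipHeaderProbe` and `startsWithGzipHeader` are hypothetical names, and the check covers only the fixed leading fields that RFC 1952 defines for a GZIP header (the two magic bytes 0x1f 0x8b, the compression method 8 for deflate, and the 10-byte minimum length), not the optional fields or flags.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class GzipHeaderProbe {

    // A GZIP header is at least 10 bytes long (RFC 1952).
    static final int MIN_HEADER_LENGTH = 10;

    // Reads up to 10 bytes and reports whether they look like the start of a GZIP stream.
    // Too few bytes, or bytes that do not match, both mean "no further GZIP stream".
    static boolean startsWithGzipHeader(InputStream in) throws IOException {
        byte[] header = in.readNBytes(MIN_HEADER_LENGTH);
        if (header.length < MIN_HEADER_LENGTH) {
            return false; // underlying stream ended before a full header arrived
        }
        return (header[0] & 0xff) == 0x1f   // ID1
            && (header[1] & 0xff) == 0x8b   // ID2
            && (header[2] & 0xff) == 8;     // CM: deflate
    }

    public static void main(String[] args) throws IOException {
        byte[] nothingLeft = new byte[0];
        byte[] notGzip = "plain trailing text".getBytes();
        System.out.println(startsWithGzipHeader(new ByteArrayInputStream(nothingLeft))); // prints "false"
        System.out.println(startsWithGzipHeader(new ByteArrayInputStream(notGzip)));     // prints "false"
    }
}
```

The decision is driven by what read() actually returns rather than by an estimate, so a stream that reports available() as 0 is handled the same way as any other.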

Specification

There are no specification changes.

csr of

JDK-7036144 GZIPInputStream readTrailer uses faulty available() test for end-of-stream

Closed

relates to

JDK-8340729 GZIPInputStream readTrailer uses faulty available() test for end-of-stream

Closed

Assignee: Archie Cobbs
Reporter: Webbug Group
Reviewed By: Jaikiran Pai, Roger Riggs
Votes: 0
Watchers: 5

Created: 2024-03-06 09:55
Updated: 2024-09-26 21:41
Resolved: 2024-03-08 15:16
