-
CSR
-
Resolution: Approved
-
P4
-
None
-
behavioral
-
low
-
-
Java API
-
SE
Summary
Refine ZipOutputStream.putNextEntry(ZipEntry) to recalculate ZipEntry's compressed size when ZipEntry.setCompressedSize() has not been explicitly called on the ZipEntry.
Problem
In general it is not safe to directly write a ZipEntry obtained from ZipInputStream.getNextEntry(), ZipFile.entries(), ZipFile.getEntry() or ZipFile.stream() with ZipOutputStream.putNextEntry() to a ZipOutputStream and then read the entries data from the ZipInputStream and write it to the ZipOutputStream as follows:
ZipEntry entry;
ZipInputStream zis = new ZipInputStream(...);
ZipOutputStream zos = new ZipOutputStream(...);
while((entry = zis.getNextEntry()) != null) {
zos.putNextEntry(entry);
zis.transferTo(zos);
}
The problem with this code is that the ZIP file format does not record the compression level used for deflation in its entries. In general, it doesn't even mandate a predefined compression ratio per compression level. Therefore the compressed size recorded in a ZipEntry read from a ZIP file might differ from the new compressed size produced by the receiving ZipOutputStream. Such a difference will result in a ZipException with the following message:
java.util.zip.ZipException: invalid entry compressed size (expected 12 but got 7 bytes)
Unfortunately, there are examples where code which follows this coding pattern: Searching for "java.util.zip.ZipException: invalid entry compressed size (expected 12 but got 7 bytes)" gives ~2500 hits (~100 on StackOverflow). It's also no hard to find plenty of instances of this anti-pattern on GitHub when doing a code search for ZipEntry and putNextEntry(). E.g. Gradle 4.x wrapper task is affected as well as the latest version of the mockableAndroidJar task. I've recently fixed two occurrences of this pattern in OpenJDK (see JDK-8240333 and JDK-8240235) but there still exist more of them (e.g. test/jdk/java/util/zip/ZipFile/CopyJar.java which is there since 1999 :).
Solution
In ZipOutputStream.putNextEntry(ZipEntry) ignore and recalculate ZipEntry's compressed size when ZipEntry.setCompressedSize() has not been explicitly called on that ZipEntry.
Specification
Changes to the JarOutputStream API doc:
--- a/src/java.base/share/classes/java/util/jar/JarOutputStream.java
+++ b/src/java.base/share/classes/java/util/jar/JarOutputStream.java
@@ -76,8 +76,15 @@ public class JarOutputStream extends ZipOutputStream {
/**
* Begins writing a new JAR file entry and positions the stream
* to the start of the entry data. This method will also close
- * any previous entry. The default compression method will be
- * used if no compression method was specified for the entry.
+ * any previous entry.
+ * <p>
+ * The default compression method will be used if no compression
+ * method was specified for the entry. When writing a compressed
+ * (DEFLATED) entry, and the compressed size has not been explicitly
+ * set with the {@link ZipEntry#setCompressedSize(long)} method,
+ * then the compressed size will be set to the actual compressed
+ * size after deflation.
+ * <p>
* The current time will be used if the entry has no set modification
* time.
*
Changes to the the ZipOutputStream API doc:
--- a/src/java.base/share/classes/java/util/zip/ZipOutputStream.java
+++ b/src/java.base/share/classes/java/util/zip/ZipOutputStream.java
@@ -179,9 +179,15 @@ public class ZipOutputStream extends DeflaterOutputStream implements ZipConstant
/**
* Begins writing a new ZIP file entry and positions the stream to the
* start of the entry data. Closes the current entry if still active.
+ * <p>
* The default compression method will be used if no compression method
- * was specified for the entry, and the current time will be used if
- * the entry has no set modification time.
+ * was specified for the entry. When writing a compressed (DEFLATED)
+ * entry, and the compressed size has not been explicitly set with the
+ * {@link ZipEntry#setCompressedSize(long)} method, then the compressed
+ * size will be set to the actual compressed size after deflation.
+ * <p>
+ * The current time will be used if the entry has no set modification time.
+ *
* @param e the ZIP entry to be written
* @throws ZipException if a ZIP format error has occurred
* @throws IOException if an I/O error has occurred
See https://github.com/openjdk/jdk/pull/520
The complete changes are attached as 0001-8253952-Refine-ZipOutputStream.putNextEntry-to-recal.patch to this CSR.
- csr of
-
JDK-8253952 Refine ZipOutputStream.putNextEntry() to recalculate ZipEntry's compressed size
-
- Resolved
-