Bug
Resolution: Won't Fix
P4
8, 11, 16, 17
ADDITIONAL SYSTEM INFORMATION :
macOS 11.2.3
java 16 2021-03-16
Java(TM) SE Runtime Environment (build 16+36-2231)
Java HotSpot(TM) 64-Bit Server VM (build 16+36-2231, mixed mode, sharing)
A DESCRIPTION OF THE PROBLEM :
When writing a streaming .zip file, as ZipOutputStream does, each compressed file is followed by a data descriptor with the file's CRC-32, compressed size, and uncompressed size. This data descriptor has two forms, one with four-byte lengths for the compressed and uncompressed sizes, and one with eight-byte lengths. Which one to use and expect is determined by whether a Zip64 Extended Information Field is present for the file or not. If one is present, then eight-byte lengths are used. Otherwise four-byte lengths are used, per section 4.3.9.2 of the PKWare APPNOTE (https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT).
When the local header for a file starts at or beyond the 4 GB boundary of a large .zip file (offset >= 2^32), a Zip64 Extended Information Field is required for the file in order to represent the offset of the local file header in the central directory header. ZipOutputStream dutifully generates and inserts such an extra field in that case. However, if the compressed and uncompressed sizes are both less than 4 GB, ZipOutputStream incorrectly uses four-byte lengths for the data descriptor.
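The rule described above can be sketched as a small helper. This is only an illustration of the report's reading of APPNOTE section 4.3.9.2, not ZipOutputStream's code; the class and method names are hypothetical, and the length is assumed to include the 4-byte descriptor signature (0x08074b50), which matches the 16 and 24 byte figures used in this report.

```java
// Hypothetical helper: data descriptor length as a function of Zip64 extra field presence.
class DataDescriptorLength {
    static final int SIGNATURE_BYTES = 4; // 0x08074b50
    static final int CRC_BYTES = 4;       // CRC-32 of the uncompressed data

    // Eight-byte sizes when a Zip64 Extended Information Field is present, else four-byte sizes.
    static int length(boolean hasZip64ExtraField) {
        int sizeFieldBytes = hasZip64ExtraField ? 8 : 4;
        return SIGNATURE_BYTES + CRC_BYTES + 2 * sizeFieldBytes;
    }

    public static void main(String[] args) {
        System.out.println("without Zip64 extra field: " + length(false));
        System.out.println("with Zip64 extra field:    " + length(true));
    }
}
```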
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Compile and run the source code below, writing the System.out output to a zip file. Name it what you like. I named it mal.zip.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
(Read the actual result below first. Not sure why you put expected before actual.)
What is expected is that the zip file comply with the PKWare specification, in particular that the data descriptor for the second file be 24 bytes long, using eight-byte lengths for the compressed and uncompressed sizes, since that file has a Zip64 Extended Information Field.
ACTUAL -
At offset 4294967298 (2^32 + 2) you will find the local header for the second file in the zip file. At offset 4294967401 you will find the central header for the second file. That central header has a Zip64 Extended Information Field containing the offset of the local header (4294967298 or 0x100000002). At offset 4294967336 you will find the data descriptor for the second file. It is 16 bytes long, using four-byte lengths for the compressed and uncompressed sizes.
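The offsets quoted above can be cross-checked with a little arithmetic. The sketch below assumes the standard fixed header sizes (30 bytes for a local header, 46 for a central header), three-byte names, no extra field in the second local header or in the first central header, and that the central directory starts immediately after the second data descriptor. The class name and constants are hypothetical.

```java
// Hypothetical cross-check of the offsets reported for mal.zip.
class OffsetCheck {
    static final long LOCAL_HEADER_FIXED = 30;   // fixed part of a local file header
    static final long CENTRAL_HEADER_FIXED = 46; // fixed part of a central directory header
    static final long NAME_LENGTH = 3;           // "foo" and "bar"

    // Offsets quoted in the report.
    static final long SECOND_LOCAL_HEADER = 4294967298L;
    static final long SECOND_DESCRIPTOR = 4294967336L;
    static final long SECOND_CENTRAL_HEADER = 4294967401L;

    // Bytes between the end of the second local header and its data descriptor.
    static long compressedBytesOfSecondEntry() {
        return SECOND_DESCRIPTOR - SECOND_LOCAL_HEADER - (LOCAL_HEADER_FIXED + NAME_LENGTH);
    }

    // Where the second central header would start for a given descriptor length.
    static long secondCentralHeader(long descriptorLength) {
        long centralDirectoryStart = SECOND_DESCRIPTOR + descriptorLength;
        return centralDirectoryStart + CENTRAL_HEADER_FIXED + NAME_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println("compressed bytes of second entry: " + compressedBytesOfSecondEntry());
        System.out.println("16-byte descriptor matches report: "
                + (secondCentralHeader(16) == SECOND_CENTRAL_HEADER));
        System.out.println("24-byte descriptor matches report: "
                + (secondCentralHeader(24) == SECOND_CENTRAL_HEADER));
    }
}
```

Under these assumptions, only the 16-byte descriptor places the second central header at the reported offset 4294967401; a 24-byte descriptor would push it 8 bytes further.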
---------- BEGIN SOURCE ----------
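// Generate a large, malformed zip file using Java's ZipOutputStream.
import java.io.*;
import java.util.zip.*;

class malzip {
    public static void main(String[] args) {
        ZipOutputStream zip = new ZipOutputStream(new BufferedOutputStream(System.out));
        zip.setLevel(0); // no compression, so the output size is predictable
        try {
            // First entry: 49093 * 87473 zero bytes, just under 4 GB, so that
            // the next local header starts past the 2^32 boundary.
            zip.putNextEntry(new ZipEntry("foo"));
            byte[] buf = new byte[87473];
            for (int i = 0; i < 49093; i++)
                zip.write(buf);
            zip.closeEntry();

            // Second entry: empty, but its local header offset needs Zip64.
            zip.putNextEntry(new ZipEntry("bar"));
            zip.close(); // closes the entry and writes the central directory
        }
        catch (IOException e) {
            System.err.println("I/O error writing zip file");
        }
    }
}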
---------- END SOURCE ----------
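To inspect the generated file without a hex editor, a probe along these lines could be used. It is a sketch, not part of the original report: the class and method names are hypothetical, and it only works for the last entry in the archive, because it decides between the 16-byte and 24-byte forms by checking where the first central directory header signature (0x02014b50) appears after the descriptor.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical probe: report the length of the data descriptor at a given offset.
class DescriptorProbe {
    static final int DESCRIPTOR_SIG = 0x08074b50;     // data descriptor signature
    static final int CENTRAL_HEADER_SIG = 0x02014b50; // central directory header signature

    // Read a little-endian 32-bit value at the given file offset.
    static int readIntLE(RandomAccessFile file, long offset) throws IOException {
        file.seek(offset);
        byte[] b = new byte[4];
        file.readFully(b);
        return (b[0] & 0xff) | (b[1] & 0xff) << 8 | (b[2] & 0xff) << 16 | (b[3] & 0xff) << 24;
    }

    // Returns 16 or 24 depending on where the central directory begins,
    // or -1 if the offset does not hold a descriptor followed by a central header.
    static int descriptorLength(RandomAccessFile file, long offset) throws IOException {
        if (readIntLE(file, offset) != DESCRIPTOR_SIG) return -1;
        for (int length : new int[] {16, 24}) {
            if (offset + length + 4 <= file.length()
                    && readIntLE(file, offset + length) == CENTRAL_HEADER_SIG) {
                return length;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: zip file, args[1]: descriptor offset, e.g. mal.zip 4294967336
        try (RandomAccessFile file = new RandomAccessFile(args[0], "r")) {
            System.out.println(descriptorLength(file, Long.parseLong(args[1])));
        }
    }
}
```

Run against mal.zip with the offset 4294967336 given in this report, the probe would be expected to print 16 if the file matches the behavior described under ACTUAL.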
CUSTOMER SUBMITTED WORKAROUND :
None.
FREQUENCY : always