JDK-8226530

ZipFile reads wrong entry size from ZIP64 entries



        There is a regression in ZipFile support caused by JDK-8145260.

        The simplest reproducer is this:

        import java.io.*;
        import java.util.*;
        import java.util.zip.*;

        public class ZipFileTest {
          public static void main(String[] args) throws Exception {
            File file = new File(args[0]);

            System.out.println("Trying with ZipInputStream:");
            try (FileInputStream fis = new FileInputStream(file);
              ZipInputStream zis = new ZipInputStream(fis)) {
              ZipEntry entry;
              while ((entry = zis.getNextEntry()) != null) {
                System.out.println(entry.getName() + ": " + entry.getSize());
              }
            }

            System.out.println("Trying with ZipFile:");
            try (ZipFile zip = new ZipFile(file)) {
              Enumeration<? extends ZipEntry> entries = zip.entries();
              while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                System.out.println(entry.getName() + ": " + entry.getSize());
              }
            }
          }
        }

        This is how you run it:
         $ dd if=/dev/zero of=bigfile bs=1K count=5M
         $ dd if=/dev/zero of=smallfile bs=1K count=1K
         $ zip -1 test.zip bigfile smallfile
         $ javac ZipFileTest.java
         $ java ZipFileTest test.zip

        Trying with ZipInputStream:
        bigfile: 5368709120
        smallfile: 1048576
        Trying with ZipFile:
        bigfile: 4294967295 <--- bad!
        smallfile: 1048576

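        That 4294967295 (0xFFFFFFFF) is actually ZIP64_MAGICVAL, the placeholder stored in the 4-byte size field when the real size is recorded in the ZIP64 extended information extra field. ZipFile.getEntry does not handle it, in contrast to ZipInputStream, which reads the extra field to get the real size.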
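
        To make the lookup described above concrete, here is a minimal sketch of how a reader can resolve the real size. It is not the JDK implementation: the class name Zip64SizeSketch and the method resolveSize are made up for illustration, and it assumes the uncompressed size is the first 8-byte value in the ZIP64 block (header ID 0x0001), which holds when the 4-byte size field carries the magic value.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of java.util.zip.
final class Zip64SizeSketch {
    static final long ZIP64_MAGICVAL = 0xFFFFFFFFL; // sentinel in the 4-byte size field
    static final int ZIP64_EXTID = 0x0001;          // header ID of the ZIP64 extra block

    /**
     * Returns the uncompressed size of an entry, given the value of the
     * 4-byte size field and the raw extra-field bytes of the same header.
     */
    static long resolveSize(long size32, byte[] extra) {
        if (size32 != ZIP64_MAGICVAL || extra == null) {
            return size32; // the 4-byte value is the real size
        }
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        // The extra field is a sequence of (id: u16, length: u16, data) blocks.
        while (buf.remaining() >= 4) {
            int id = buf.getShort() & 0xFFFF;
            int len = buf.getShort() & 0xFFFF;
            if (len > buf.remaining()) {
                break; // malformed block, give up
            }
            if (id == ZIP64_EXTID && len >= 8) {
                // Assumption: the uncompressed size comes first in the block.
                return buf.getLong();
            }
            buf.position(buf.position() + len); // skip unrelated blocks
        }
        return size32; // no ZIP64 block found, nothing better to report
    }

    public static void main(String[] args) {
        // Extra field shaped like the one for "bigfile": a ZIP64 block with
        // the real uncompressed size and a made-up compressed size.
        ByteBuffer extra = ByteBuffer.allocate(20).order(ByteOrder.LITTLE_ENDIAN);
        extra.putShort((short) ZIP64_EXTID).putShort((short) 16);
        extra.putLong(5368709120L).putLong(5210306L);

        System.out.println(resolveSize(ZIP64_MAGICVAL, extra.array()));
        System.out.println(resolveSize(1048576L, null));
    }
}
```

        A reader that skips this step reports the sentinel itself, which is the 4294967295 seen in the ZipFile output above.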

        Remarkably, the internal ZipFileInputStream inside ZipFile.java handles ZIP64 correctly, so this hack works:

        diff -r 12e8433e2581 src/java.base/share/classes/java/util/zip/ZipFile.java
        --- a/src/java.base/share/classes/java/util/zip/ZipFile.java Thu Jun 20 08:02:41 2019 +0000
        +++ b/src/java.base/share/classes/java/util/zip/ZipFile.java Thu Jun 20 19:55:16 2019 +0200
        @@ -681,10 +681,15 @@
                         e.comment = zc.toStringUTF8(cen, start, clen);
                     } else {
                         e.comment = zc.toString(cen, start, clen);
                     }
                 }
        +
        +        // Hack: ZipFileInputStream knows how to deal with ZIP64.
        +        ZipFileInputStream zfis = new ZipFileInputStream(cen, pos);
        +        e.size = zfis.size;
        +
                 lastEntryName = e.name;
                 lastEntryPos = pos;
                 return e;
             }
         
        $ java ZipFileTest test.zip
        Trying with ZipInputStream:
        bigfile: 5368709120
        smallfile: 1048576
        Trying with ZipFile:
        bigfile: 5368709120 <--- good!
        smallfile: 1048576


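        We have observed it with smaller archives as well. The bug only requires zip entries to have their sizes recorded with ZIP64 extensions. Linux zip seems to generate the old 4-byte sizes for entries that fit, which is why the reproducer above needs a file larger than 4 GB, but a writer may use ZIP64 for entries of any size. This is explicitly allowed by the spec: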

              4.3.9.2 When compressing files, compressed and uncompressed sizes
              SHOULD be stored in ZIP64 format (as 8 byte values) when a
              file's size exceeds 0xFFFFFFFF. However ZIP64 format MAY be
              used regardless of the size of a file. When extracting, if
              the zip64 extended information extra field is present for
              the file the compressed and uncompressed sizes will be 8
              byte values.
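
        The writer side of that rule can be sketched as follows, to show how a small entry can end up with the magic value in its 4-byte size field. This is an illustration only, not taken from any real zip tool: Zip64WriterSketch, EncodedSize, encode and the forceZip64 flag are hypothetical names, and the block layout (uncompressed size followed by compressed size) is the one assumed for a header whose two size fields are both set to the magic value.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical writer-side helper, not part of java.util.zip.
final class Zip64WriterSketch {
    static final long ZIP64_MAGICVAL = 0xFFFFFFFFL;
    static final int ZIP64_EXTID = 0x0001;

    /** What a writer would store in the header for one entry's sizes. */
    static final class EncodedSize {
        final long size32;  // value written into the 4-byte uncompressed size field
        final byte[] extra; // ZIP64 extra block, or empty when ZIP64 is not used

        EncodedSize(long size32, byte[] extra) {
            this.size32 = size32;
            this.extra = extra;
        }
    }

    static EncodedSize encode(long size, long csize, boolean forceZip64) {
        // ZIP64 is needed when a size does not fit in 4 bytes (0xFFFFFFFF itself
        // is reserved as the sentinel), and MAY be used regardless of size.
        boolean zip64 = forceZip64 || size >= ZIP64_MAGICVAL || csize >= ZIP64_MAGICVAL;
        if (!zip64) {
            return new EncodedSize(size, new byte[0]);
        }
        ByteBuffer buf = ByteBuffer.allocate(20).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) ZIP64_EXTID).putShort((short) 16); // block id, data length
        buf.putLong(size).putLong(csize);                       // real 8-byte sizes
        return new EncodedSize(ZIP64_MAGICVAL, buf.array());
    }

    public static void main(String[] args) {
        EncodedSize plain = encode(1048576L, 1048L, false);
        EncodedSize forced = encode(1048576L, 1048L, true);
        System.out.println(plain.size32 + " extra bytes: " + plain.extra.length);
        System.out.println(forced.size32 + " extra bytes: " + forced.extra.length);
    }
}
```

        With forceZip64 set, even the 1 MB entry carries the sentinel in its 4-byte field, so a reader that ignores the extra block would report the wrong size for it too.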

              lancea Lance Andersen
              shade Aleksey Shipilev