JDK-4980042

Cannot use Surrogates in zip file metadata like filenames

    • Type: Bug
    • Resolution: Fixed
    • Priority: P4
    • Fix Version: 7
    • Affects Version: 5.0
    • Component: core-libs
    • Resolved In Build: b57
    • CPU: generic
    • OS: generic

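      java/util/zip/ZipOutputStream.java contains its own UTF-8 encoder, which
      does not handle surrogate pairs. Each surrogate char is encoded separately
      as a 3-byte sequence, so a supplementary character comes out as 6 bytes
      instead of the 4-byte standard UTF-8 form: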

      private static byte[] getUTF8Bytes(String s) {
          char[] c = s.toCharArray();
          int len = c.length;
          // Count the number of encoded bytes...
          int count = 0;
          for (int i = 0; i < len; i++) {
              int ch = c[i];
              if (ch <= 0x7f) {
                  count++;
              } else if (ch <= 0x7ff) {
                  count += 2;
              } else {
                  count += 3;
              }
          }
          // Now return the encoded bytes...
          byte[] b = new byte[count];
          int off = 0;
          for (int i = 0; i < len; i++) {
              int ch = c[i];
              if (ch <= 0x7f) {
                  b[off++] = (byte)ch;
              } else if (ch <= 0x7ff) {
                  b[off++] = (byte)((ch >> 6) | 0xc0);
                  b[off++] = (byte)((ch & 0x3f) | 0x80);
              } else {
                  b[off++] = (byte)((ch >> 12) | 0xe0);
                  b[off++] = (byte)(((ch >> 6) & 0x3f) | 0x80);
                  b[off++] = (byte)((ch & 0x3f) | 0x80);
              }
          }
          return b;
      }
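
For comparison, here is a minimal sketch of what a surrogate-aware version of the routine above might look like. It is not the code that went into the JDK. The class name SurrogateAwareUtf8 is hypothetical, and replacing an unpaired surrogate with '?' is an assumption chosen to match what String.getBytes(StandardCharsets.UTF_8) does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical class name; a sketch only, not the actual JDK fix.
class SurrogateAwareUtf8 {

    // Encodes s as standard UTF-8, walking code points rather than chars so
    // that a surrogate pair becomes one 4-byte sequence.
    static byte[] getUTF8Bytes(String s) {
        int len = s.length();
        // Upper bound: at most 3 bytes per char (a pair is 2 chars, 4 bytes).
        byte[] b = new byte[len * 3];
        int off = 0;
        for (int i = 0; i < len; ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp <= 0x7f) {
                b[off++] = (byte) cp;
            } else if (cp <= 0x7ff) {
                b[off++] = (byte) ((cp >> 6) | 0xc0);
                b[off++] = (byte) ((cp & 0x3f) | 0x80);
            } else if (cp >= 0xd800 && cp <= 0xdfff) {
                // Unpaired surrogate: codePointAt returns it unchanged.
                // Assumption: substitute '?', as String.getBytes does.
                b[off++] = (byte) '?';
            } else if (cp <= 0xffff) {
                b[off++] = (byte) ((cp >> 12) | 0xe0);
                b[off++] = (byte) (((cp >> 6) & 0x3f) | 0x80);
                b[off++] = (byte) ((cp & 0x3f) | 0x80);
            } else {
                // Supplementary character: 4-byte form.
                b[off++] = (byte) ((cp >> 18) | 0xf0);
                b[off++] = (byte) (((cp >> 12) & 0x3f) | 0x80);
                b[off++] = (byte) (((cp >> 6) & 0x3f) | 0x80);
                b[off++] = (byte) ((cp & 0x3f) | 0x80);
            }
        }
        return Arrays.copyOf(b, off);
    }

    public static void main(String[] args) {
        // U+1F600 is written as a surrogate pair in Java source.
        String name = "smile-\uD83D\uDE00.txt";
        byte[] expected = name.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(getUTF8Bytes(name), expected));
    }
}
```

Iterating with codePointAt and charCount keeps the loop structure close to the original while making the pair handling explicit.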
      -----------------------------------------------------------
      Also, Norbert Lindenberg noted:

      I did notice another thing that looks fishy:
      src/share/native/java/util/zip/ZipFile.c has calls to the JNI routines
      GetStringUTFLength and GetStringUTFRegion, apparently also to handle
      file names. These are probably wrong, because JNI uses modified UTF-8
      and zip/jar files should use standard UTF-8.
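
The difference between the two encodings can be seen from pure Java, as a rough illustration. The sketch below uses DataOutputStream.writeUTF as a stand-in for the JNI routines, on the assumption that both produce the same modified UTF-8 (both are documented as modified UTF-8). The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; DataOutputStream.writeUTF stands in for JNI here.
class ModifiedUtf8Demo {

    // Modified UTF-8 bytes of s, with writeUTF's 2-byte length prefix removed.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = bos.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Standard UTF-8 bytes of s.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String supplementary = "\uD83D\uDE00"; // U+1F600 as a surrogate pair
        String nul = "\u0000";
        // Supplementary character: 6 bytes in modified UTF-8, 4 in standard.
        System.out.println(modifiedUtf8(supplementary).length
                + " vs " + standardUtf8(supplementary).length);
        // U+0000: 2 bytes (C0 80) in modified UTF-8, 1 byte in standard.
        System.out.println(modifiedUtf8(nul).length
                + " vs " + standardUtf8(nul).length);
    }
}
```

For names made only of nonzero BMP characters the two encodings agree, so the mismatch would only show up for supplementary characters and U+0000.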

            Assignee: Xueming Shen (sherman)
            Reporter: Martin Buchholz (martin)