JDK-8243254

Examine ZipFile slash optimization for non-ASCII compatible charsets

Details

    • Enhancement
    • Resolution: Fixed
    • P3
    • 15
    • None
    • core-libs

    Description

      ZipFile.getEntry optimizes the check for directory entries by appending a '/' to the encoded byte array. JDK-8242959 improved this optimization, but also raised the question of whether it is valid for all charsets.

      For example, UTF-16 encodes '/' (U+002F) as either 2F 00 or 00 2F, depending on byte order, which means the hash code would differ and a directory "foo/" might not be found when looking up "foo". Further complications arise if the directory name ends with a code point whose encoding ends in the byte 2F, e.g. \u012F.
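      The byte sequences in question can be printed with a small standalone sketch. The class name SlashEncodingDemo and the hex helper are made up for illustration and are not part of ZipFile; only String.getBytes(Charset) and StandardCharsets are real JDK APIs here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not JDK code: prints how '/' and \u012F are encoded.
class SlashEncodingDemo {
    // Formats a byte array as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // '/' is the single byte 2F in UTF-8 ...
        System.out.println("UTF-8    '/'    : " + hex("/".getBytes(StandardCharsets.UTF_8)));
        // ... but two bytes in UTF-16, in an order that depends on endianness.
        System.out.println("UTF-16LE '/'    : " + hex("/".getBytes(StandardCharsets.UTF_16LE)));
        System.out.println("UTF-16BE '/'    : " + hex("/".getBytes(StandardCharsets.UTF_16BE)));
        // \u012F ends in the byte 2F in UTF-16BE although it is not a slash.
        System.out.println("UTF-16BE \\u012F : " + hex("\u012F".getBytes(StandardCharsets.UTF_16BE)));
    }
}
```

      In UTF-16BE the last byte of \u012F is 2F, so a check that only looks at the final byte of the encoded name would mistake it for a trailing slash.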

      We should consider applying the low-level optimization only when the charset in use is known to be ASCII-compatible, in the sense that '/' is encoded as the single byte 2F. Since more or less all jar files are assumed to be UTF-8, which is compatible in this sense, this should have little effect on performance.
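      One way such a compatibility test could look is sketched below. This is an illustration only, not the change made for this issue; the class SlashCompat and the method encodesSlashAsSingleByte are hypothetical names. It also checks only how '/' itself is encoded, and does not prove that no other character in the charset ends in the byte 2F.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not JDK code: tests the "2F encodes as single-byte 2F" property.
class SlashCompat {
    // Returns true if the charset encodes '/' as exactly one byte with value 0x2F.
    static boolean encodesSlashAsSingleByte(Charset cs) {
        byte[] encoded = "/".getBytes(cs);
        return encoded.length == 1 && encoded[0] == 0x2F;
    }

    public static void main(String[] args) {
        Charset[] candidates = {
            StandardCharsets.UTF_8,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_16LE,
            StandardCharsets.UTF_16BE
        };
        for (Charset cs : candidates) {
            System.out.println(cs.name() + ": " + encodesSlashAsSingleByte(cs));
        }
    }
}
```

      A caller could evaluate this once per charset and fall back to a slower, charset-agnostic directory lookup when it returns false.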

          People

            redestad Claes Redestad