JDK-8355940

Improve jar --validate to detect duplicate or invalid entries


    • CSR
    • Resolution: Approved
    • P4
    • 25
    • tools
    • None
    • jar
    • behavioral
    • minimal
    • No changes have been made to the Java runtime. The --validate option of the jar tool is enhanced to warn of additional inconsistencies in the JAR file. For some JAR files this may cause the "jar --validate" command to generate additional warning messages and exit with a different code than previously. The compatibility impact of this change is minimal.
    • add/remove/modify command line option
    • JDK

      Summary

      Enhance the --validate option of the jar tool to report duplicate entries, mismatched order between central directory and local file headers, and entry names that don't comply with the ZIP specification.

      Problem

      The ZIP specification has been around for 30 years. The latest specification is located at https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT. Over time the specification has undergone changes and in some places the specification has become stricter.

      A JAR file is a ZIP file with some additional expectations of the contained entries. Java SE provides several APIs to work with ZIP files and JAR files. Given how prominent and common the ZIP format is, there are several tools and APIs within and external to the Java ecosystem that allow for constructing ZIP and JAR files. Depending on which tool is used to create the JAR file, there can be inconsistencies in how the underlying ZIP structure is interpreted by the Java ecosystem. Using such JAR files can then lead to unspecified behaviour in various parts of the Java runtime.

      Following are some potential issues in a JAR file:

      Duplicate entry names - Although the ZipOutputStream class does not allow creating a ZIP/JAR file with duplicate entry names, the ZIP specification does allow it.
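
      As a small illustration of the ZipOutputStream behaviour mentioned above, the following sketch tries to add the same entry name twice and reports whether the second attempt was rejected. The class and method names are hypothetical and exist only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of the JDK or of this change.
class DuplicateEntryDemo {

    // Returns true if ZipOutputStream refuses a second entry with the same name.
    static boolean rejectsDuplicate(String name) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(new ByteArrayOutputStream())) {
            zos.putNextEntry(new ZipEntry(name));
            zos.closeEntry();
            try {
                // Second entry with an identical name.
                zos.putNextEntry(new ZipEntry(name));
                return false;
            } catch (ZipException e) {
                // ZipOutputStream signals the duplicate with a ZipException.
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("duplicate rejected: " + rejectsDuplicate("entry1.txt"));
    }
}
```

      Archives produced by other tools are not subject to this check, which is why duplicates can still reach the Java runtime.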

      Entry names that don't comply with the ZIP specification - In recent updates of the ZIP specification, section 4.4.17.1 was changed to be stricter about entry names:

      4.4.17.1 The name of the file, with optional relative path. The path stored MUST NOT contain a drive or device letter, or a leading slash. All slashes MUST be forward slashes '/' as opposed to backwards slashes '\' for compatibility with Amiga and UNIX file systems etc. If input came from standard input, there is no file name field.

      Prior to APPNOTE.txt version 6.3.3, the wording was "should not". The ZipOutputStream class and several other tools within the ecosystem don't enforce this entry name specification, and thus there are JAR files in the ecosystem which may not match this expectation.

      Additionally, since JDK 17, the ZIP Filesystem specification (https://docs.oracle.com/en/java/javase/24/docs/api/jdk.zipfs/module-summary.html#accessing-a-zip-file-system-heading) does not support entries with "." or ".." in their name elements.

      Missing LOC or CEN entries - The ZIP structure consists of a local header (henceforth referred to as LOC) and a central record (henceforth referred to as CEN) for each entry. Section 4.3.2 of the specification states:

      4.3.2 Each file placed into a ZIP file MUST be preceded by a "local file header" record for that file. Each "local file header" MUST be accompanied by a corresponding "central directory header" record within the central directory section of the ZIP file.

      The ZipOutputStream class of Java SE ensures that it follows this specification when generating the ZIP/JAR file. Other tools in the ecosystem may not. The ZipInputStream, JarInputStream, ZipFile and JarFile APIs do not verify that the ZIP/JAR file they are consuming complies with this specification.

      Identifying such issues with a JAR file will help applications to address those issues and thus prevent unspecified behaviour when such JAR files are used in the Java runtime.

      The JDK ships the jar tool. Since Java 17, through JDK-8266835, the jar tool has had the --validate option. As the name states, this option is used to validate the contents of a JAR file and report any problems identified in the JAR file. The --validate option of the jar tool provides us the opportunity to implement additional validations in the jar tool to detect and report these issues.

      Solution

      The jar --validate command is enhanced to identify and report the following issues in a JAR file:

      • presence of more than one entry with the same name
      • presence of entries that have characters or placement of characters in their name that don't comply with the ZIP specification
      • inconsistencies in the LOC listing or ordering of entries as compared to the CEN of the ZIP file

      Specification

      Usage remains the same. If the jar --validate command detects any integrity issues, warning messages are reported to System.err and the tool exits with a status >0.
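
      A build script could act on this exit status. The sketch below is one possible way to do that from Java, using java.util.spi.ToolProvider to run the jar tool in-process. It assumes a full JDK (so that the "jar" tool provider is present); the class name, method names and sample JAR contents are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.spi.ToolProvider;
import java.util.zip.ZipEntry;

// Hypothetical helper; assumes the "jar" tool provider is available (full JDK).
class ValidateExitStatus {

    // Runs "jar --validate --file <jar>" and returns the tool's exit status.
    static int validate(Path jar) {
        ToolProvider jarTool = ToolProvider.findFirst("jar")
                .orElseThrow(() -> new IllegalStateException("jar tool not available"));
        return jarTool.run(System.out, System.err, "--validate", "--file", jar.toString());
    }

    // Writes a small, well-formed JAR with a manifest and one text entry.
    static Path sampleJar() throws IOException {
        Path jar = Files.createTempFile("sample", ".jar");
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, manifest)) {
            jos.putNextEntry(new ZipEntry("entry1.txt"));
            jos.write("hello".getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        int status = validate(sampleJar());
        // A non-zero status means warnings or errors were reported on System.err.
        System.out.println(status == 0 ? "no integrity issues" : "issues found, status " + status);
    }
}
```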

      Consider a JAR file with the following content:

      META-INF/MANIFEST.MF
      META-INF/AANIFEST.MF
      entry1.txt
      META-INF/BANIFEST.MF
      entry2.txt

      Suppose the central directory is modified to contain the following:

      META-INF/MANIFEST.MF
      META-INF/MANIFEST.MF
      entry1.txt
      META-INF/MANIFEST.MF
      entry2.txt

      Then jar --validate on that JAR file would produce:

      Warning: There were 3 central directory entries found for META-INF/MANIFEST.MF
      Warning: An equivalent entry for the local file header META-INF/AANIFEST.MF was not found in the central directory
      Warning: An equivalent entry for the local file header META-INF/BANIFEST.MF was not found in the central directory

      If the local file headers were modified instead, the output would be:

      Warning: There were 3 local file headers found for META-INF/MANIFEST.MF
      Warning: An equivalent for the central directory entry META-INF/AANIFEST.MF was not found in the local file headers
      Warning: An equivalent for the central directory entry META-INF/BANIFEST.MF was not found in the local file headers
      Warning: Central directory and local file header entries are not in the same order

      Note the ordering warning: based on the central directory, the next expected entry is META-INF/AANIFEST.MF, but entry1.txt is seen first instead. Running jar --list on this file would produce output like the following:

      META-INF/MANIFEST.MF
      META-INF/AANIFEST.MF
      entry1.txt
      META-INF/BANIFEST.MF
      entry2.txt

      If the local file headers were modified by swapping the order of AANIFEST.MF and BANIFEST.MF, the output would be:

      Warning: Central directory and local file header entries are not in the same order
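
      The kind of LOC/CEN comparison illustrated above can be approximated with the existing Java SE APIs. The sketch below is not the jar tool's implementation; it relies on ZipFile listing entries in central directory order and on ZipInputStream reading local file headers sequentially. All class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch; not the validator used by the jar tool.
class LocCenCompare {

    // Entry names in central directory (CEN) order.
    static List<String> cenNames(Path zip) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            zf.stream().forEach(e -> names.add(e.getName()));
        }
        return names;
    }

    // Entry names in local file header (LOC) order.
    static List<String> locNames(Path zip) throws IOException {
        List<String> names = new ArrayList<>();
        try (InputStream in = Files.newInputStream(zip);
             ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry e;
            while ((e = zis.getNextEntry()) != null) {
                names.add(e.getName());
            }
        }
        return names;
    }

    // Writes a small, well-formed archive to compare against itself.
    static Path sampleZip() throws IOException {
        Path zip = Files.createTempFile("sample", ".zip");
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            for (String name : List.of("META-INF/MANIFEST.MF", "entry1.txt", "entry2.txt")) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write(name.getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return zip;
    }

    public static void main(String[] args) throws IOException {
        Path zip = sampleZip();
        // Identical lists mean the same names appear in the same order in both places.
        System.out.println("consistent: " + cenNames(zip).equals(locNames(zip)));
    }
}
```

      Collecting names into lists rather than sets keeps both duplicates and ordering visible, which are the two properties the new warnings report on.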

      An entry name is considered invalid when it:

      • contains a drive or device letter
      • contains a leading slash
      • contains backward slashes '\'
      • has a file name or any path element that is "." or ".."

      If an entry has an invalid name, a warning like the following is issued:

      Warning: entry name $ENTRYNAME is not valid

      For example, a JAR file that contains an entry ..\\..\\c:\\d:\\tmp\\testentry1 would produce the following warning message:

      Warning: entry name ..\\..\\c:\\d:\\tmp\\testentry1 is not valid
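
      The entry name rules listed above can be sketched as a small predicate. This is only an illustration, not the check implemented in the jar tool. In particular, how a "drive or device letter" is recognised is an assumption here: the sketch treats a single ASCII letter followed by ':' at the start of the name as a drive letter. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the entry name rules; not the jar tool's code.
class EntryNameCheck {

    // Assumption: a drive letter is one ASCII letter followed by ':' at the start of the name.
    private static final Pattern DRIVE_PREFIX = Pattern.compile("^[A-Za-z]:");

    static boolean isValid(String name) {
        if (name.startsWith("/")) {
            return false;               // leading slash
        }
        if (name.indexOf('\\') >= 0) {
            return false;               // backward slash
        }
        if (DRIVE_PREFIX.matcher(name).find()) {
            return false;               // drive letter (assumed form)
        }
        for (String element : name.split("/")) {
            if (element.equals(".") || element.equals("..")) {
                return false;           // "." or ".." as a path element
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"META-INF/MANIFEST.MF", "/abs.txt", "a\\b.txt", "a/../b.txt"}) {
            System.out.println(name + " valid: " + isValid(name));
        }
    }
}
```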

      The jar --help output will change as follows for --validate:

            --validate             Validate the contents of the jar archive. This option:
                                   - Validates that the API exported by a multi-release
                                   jar archive is consistent across all different release
                                   versions.
                                   - Issues a warning if there are invalid or duplicate file names

      The jar.md file will change as follows:

      --- a/src/jdk.jartool/share/man/jar.md
      +++ b/src/jdk.jartool/share/man/jar.md
      @@ -1,5 +1,5 @@
       ---
      -# Copyright (c) 1997, 2024, Oracle and/or its affiliates. All rights reserved.
      +# Copyright (c) 1997, 2025, Oracle and/or its affiliates. All rights reserved.
       # DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
       #
       # This code is free software; you can redistribute it and/or modify it
      @@ -106,6 +106,10 @@ ## Main Operation Modes
       `-d` or `--describe-module`
       :   Prints the module descriptor or automatic module name.
      
      +`--validate`
      +:   Validate the contents of the JAR file.
      +    See `Integrity of a JAR File` section below for more details.
      +
       ## Operation Modifiers Valid in Any Mode
      
       You can use the following options to customize the actions of any operation
      @@ -213,6 +217,26 @@ ## Other Options
       `--version`
       :   Prints the program version.
      
      +## Integrity of a JAR File
      +As a JAR file is based on ZIP format, it is possible to create a JAR file using tools
      +other than the `jar` command. The --validate option may be used to perform the following
      +integrity checks against a JAR file:
      +
      +- That there are no duplicate Zip entry file names
      +- Verify that the Zip entry file name:
      +    - is not an absolute path
      +    - the file name is not '.' or '..'
      +    - does not contain a backslash, '\\'
      +    - does not contain a drive letter
      +    - path element does not include '.' or '..
      +- The API exported by a multi-release jar archive is consistent across all different release
      +  versions.
      +
      +The jar tool exits with a status of 0 if there were no integrity issues encountered and >0 if an
      +error/warning occurred.
      +
      +When an integrity issue is reported, it will often require that the JAR file is re-created by the
      +original source of the JAR file.
      +
       ## Examples of jar Command Syntax
      
       -   Create an archive, `classes.jar`, that contains two class files,

            henryjen Henry Jen
            henryjen Henry Jen
            Jaikiran Pai, Lance Andersen