JDK-8267650

Better-defined JVM class file validation

    • Type: JEP
    • Resolution: Withdrawn
    • Priority: P4
    • None
    • specification
    • None
    • Feature
    • Open
    • vm
    • SE
    • valhalla dash dev at openjdk dot java dot net
    • M
    • S

      Summary

      Update JVMS to more clearly define the requirements and timing of JVM class file validation. Align HotSpot with these rules.

      Goals

      Specification and implementation updates will impact format checking, which occurs during class loading, and verification, which occurs between class loading and class initialization.

      Special attention will be given to the following areas:

      • Distinguishing between validation rules and unenforced recommendations
      • The treatment of method names and descriptors
      • Eliminating the unused ACC_SUPER flag
      • Selective validation of attributes
      • The timing of Code and StackMapTable attribute checks
      • The distinct roles of static and structural constraints
      • Correcting and eliminating redundant verification rules

      Most HotSpot changes will be subtle, as necessary to reconcile differences between the specification and implementation.

      APIs that provide information about loaded classes (such as core reflection and JDI) may also need to make subtle adjustments to their validation behavior.

      Non-Goals

      The changes will not address any anomalies in constant pool resolution or runtime execution—this effort is only concerned with class file validation.

      The use of Prolog rules to specify verification since Java SE 6 can make some parts of JVMS difficult to read, but this JEP will not alter that approach.

      The specifications of APIs that operate on class files, like core reflection, often elide many details about API-specific validation behavior. This JEP will not attempt to fill in those details.

      Motivation

      The Valhalla Project is pursuing significant changes to the Java programming model and the Java Virtual Machine. It anticipates extending the class file format with a number of new opcodes, constant pool entries, descriptor forms, verification types, special methods, and attributes.

      In anticipation of these changes, it will be useful to put the rules for class file validation on a solid footing.

      Broadly, the JVM processes class files in stages; at each stage, certain categories of validation rules are enforced.

      • When a class is loaded, the bytes of the class file are parsed, and some basic structural rules are enforced. This is called format checking (JVMS 4.8, 5.3.5). If format checking fails, the class cannot be loaded.

      • Before the class can be initialized, the bytecode of every method is checked for both valid syntax and consistent use of types. This is called verification (JVMS 4.10, 5.4.1). If verification fails, no code in the class can be executed.

      • At some point before a specific instruction is executed, a search for any referenced class, field, or method must be performed. This is called resolution (JVMS 5.4.3, 6.5). If resolution fails, execution of the instruction throws an error.

      • Some class file attributes are interpreted by APIs or tools. (For example, the Signature attribute is interpreted by both javac and core reflection methods like Class.getGenericSuperclass.) These APIs and tools have their own validation rules, which may lead to errors or other exceptional behavior when the API or tool is invoked.
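
      The first stage can be observed directly. The following is a minimal sketch, assuming a HotSpot-like JVM: it passes bytes that are not a class file to ClassLoader.defineClass and reports the error raised at load time. The class and method names (FormatCheckDemo, ByteLoader, tryLoad) are hypothetical and used only for illustration.

```java
class FormatCheckDemo {
    /** A throwaway loader that exposes defineClass, which performs format checking. */
    static final class ByteLoader extends ClassLoader {
        Class<?> define(byte[] bytes) {
            return defineClass(null, bytes, 0, bytes.length);
        }
    }

    /** Returns the simple name of the error raised while loading, or "loaded" on success. */
    static String tryLoad(byte[] bytes) {
        try {
            new ByteLoader().define(bytes);
            return "loaded";
        } catch (LinkageError e) {
            // ClassFormatError is a LinkageError; it signals that format checking failed.
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // These bytes do not start with the 0xCAFEBABE magic number.
        byte[] notAClassFile = {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07};
        System.out.println(tryLoad(notAClassFile)); // ClassFormatError
    }
}
```

      Verification and resolution errors, by contrast, would only surface later, when the class is linked or a specific instruction is executed.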

      This JEP is focused on the validation rules enforced by format checking and verification. It also has a subtle impact on the rules some APIs and tools are expected to enforce.

      Historically, the lines between different validation stages were sometimes blurred, and some anomalies persist in the JVM specification. Readers and implementers of the specification may be left with questions such as:

      • Which rules about class, field, and method references actually lead to load-time ClassFormatErrors? For example, how can the JVM know whether a named class is actually an interface, or whether a field exists?

      • What happens if an array type is used in place of a class name in contexts like a class's this_class or a field or method reference's class_index?

      • When are references to the special method names <init> or <clinit> allowed? Under what conditions is it a ClassFormatError to reference one of these names with an inappropriate descriptor?

      • Why are some attributes, like InnerClasses or LocalVariableTable, "optional" but still validated? Under what conditions is an inconsistency in such an attribute considered a load-time ClassFormatError?

      • Which rules about the StackMapTable attribute are enforced during format checking, and which rules are enforced during verification?

      This JEP addresses these and similar questions by carefully reviewing both the specification and longstanding HotSpot behavior, clarifying the specification text where necessary, and reconciling any behavioral differences.

      Description

      This work can be organized into four areas of focus, as outlined below.

      In addition to the specification and behavioral changes described here, this is an opportunity to review the treatment of format checking and verification in JVMS and the HotSpot implementation code, potentially identifying further discrepancies or unnecessary complexity.

      Format checking

      Chapter 4 of JVMS will be updated to distinguish between assertions that are meant to be enforced as format checks ("The constant_pool entry at that index must be a CONSTANT_Class_info structure") and assertions that are merely informational ("the class_index item should name a class or an array type, not an interface"). The conditions under which predefined attributes are recognized and checked will also be clarified. The ACC_SUPER flag, which has had no effect since Java 8, will no longer be specified.
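
      To illustrate the difference, here is a rough sketch of an enforceable format check: whether this_class indexes a CONSTANT_Class_info entry can be decided from the bytes alone, whereas whether a named class is really an interface cannot. The sketch hand-assembles a tiny class file and reads its header. ThisClassCheckDemo and its methods are hypothetical names, the parser handles only the two constant pool tags it needs, and it is not how HotSpot is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class ThisClassCheckDemo {
    static final int ACC_SUPER = 0x0020;
    static final int CONSTANT_UTF8 = 1;
    static final int CONSTANT_CLASS = 7;

    /** Hand-assembles the start of a class file for a hypothetical class named "A". */
    static byte[] minimalClassFile(int accessFlags, int thisClassIndex) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);                              // magic
            out.writeShort(0);                                     // minor_version
            out.writeShort(52);                                    // major_version
            out.writeShort(5);                                     // constant_pool_count
            out.writeByte(CONSTANT_CLASS); out.writeShort(2);      // #1 Class -> #2
            out.writeByte(CONSTANT_UTF8); out.writeUTF("A");       // #2 Utf8
            out.writeByte(CONSTANT_CLASS); out.writeShort(4);      // #3 Class -> #4
            out.writeByte(CONSTANT_UTF8); out.writeUTF("java/lang/Object"); // #4 Utf8
            out.writeShort(accessFlags);                           // access_flags
            out.writeShort(thisClassIndex);                        // this_class
            out.writeShort(3);                                     // super_class
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns access_flags after checking that this_class indexes a CONSTANT_Class_info. */
    static int checkHeader(byte[] classFile) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile));
            if (in.readInt() != 0xCAFEBABE) throw new ClassFormatError("bad magic");
            in.readUnsignedShort();                                // minor_version
            in.readUnsignedShort();                                // major_version
            int count = in.readUnsignedShort();
            int[] tags = new int[count];
            for (int i = 1; i < count; i++) {
                tags[i] = in.readUnsignedByte();
                if (tags[i] == CONSTANT_UTF8) in.skipBytes(in.readUnsignedShort());
                else if (tags[i] == CONSTANT_CLASS) in.skipBytes(2);
                else throw new ClassFormatError("tag not handled by this sketch: " + tags[i]);
            }
            int accessFlags = in.readUnsignedShort();
            int thisClass = in.readUnsignedShort();
            if (thisClass == 0 || thisClass >= count || tags[thisClass] != CONSTANT_CLASS) {
                throw new ClassFormatError("this_class must index a CONSTANT_Class_info");
            }
            return accessFlags;
        } catch (IOException e) {
            throw new ClassFormatError("truncated class file");
        }
    }
}
```

      For example, checkHeader(minimalClassFile(0x0021, 1)) returns flags with the ACC_SUPER bit set, while passing 2 as the this_class index (a Utf8 entry) raises ClassFormatError.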

      Two changes to HotSpot behavior with respect to attribute checking will be made:

      • Rejecting class or interface declarations with Module, ModulePackages, or ModuleMainClass attributes. If these attributes appear in the attributes table of a ClassFile structure in any appropriately versioned class file, they should be recognized as predefined attributes, and thus checked.

      • Rejecting non-static field declarations with ConstantValue attributes, similarly on the basis that if the attribute appears in the attributes table of a field_info structure, it should be recognized and checked.
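
      The two proposed rejections could be sketched as follows. This is only an illustration under simplifying assumptions: attribute names are passed in as strings, class file version checks are omitted, and the class and method names are hypothetical. The flag values ACC_STATIC (0x0008) and ACC_MODULE (0x8000) are those defined in JVMS.

```java
import java.util.Arrays;
import java.util.List;

class AttributePlacementChecks {
    static final int ACC_STATIC = 0x0008;
    static final int ACC_MODULE = 0x8000;
    static final List<String> MODULE_ONLY =
            Arrays.asList("Module", "ModulePackages", "ModuleMainClass");

    /** Rejects module-related attributes on a ClassFile that declares a class or interface. */
    static void checkClassAttributes(int classAccessFlags, List<String> attributeNames) {
        boolean isModule = (classAccessFlags & ACC_MODULE) != 0;
        for (String name : attributeNames) {
            if (!isModule && MODULE_ONLY.contains(name)) {
                throw new ClassFormatError(name + " attribute in a class or interface declaration");
            }
        }
    }

    /** Rejects a ConstantValue attribute on a field that is not static. */
    static void checkFieldAttributes(int fieldAccessFlags, List<String> attributeNames) {
        boolean isStatic = (fieldAccessFlags & ACC_STATIC) != 0;
        if (!isStatic && attributeNames.contains("ConstantValue")) {
            throw new ClassFormatError("ConstantValue attribute on a non-static field");
        }
    }
}
```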

      javac will be updated to no longer set ACC_SUPER.

      Special methods

      To improve consistency of JVMS and align with longstanding HotSpot behavior, the definitions of special methods will be revised to include any methods with the names <init> or <clinit>; a number of special restrictions apply to these method declarations and references to them. The constraints on names and descriptors in references to methods (like Methodref and InvokeDynamic) will be clarified.

      In HotSpot, the following validation behaviors will change:

      • Dropping unspecified checks on NameAndType constants. For example, the NameAndType <init>:()D is legal per the specification, even though it cannot be used in a Fieldref or Methodref.

      • Moving the check that an invokedynamic does not use the name <init> with a void-returning descriptor from verification to format checking of the InvokeDynamic constant, for consistency with other similar checks. (For example, other return types are already rejected during format checking.)

      • Enforcing, for all class file version numbers, the requirement that a <clinit> method declaration must have no parameters. (This check is not currently specified or enforced in version 50 and older class files.)
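
      As a rough model of the three behaviors listed above, the following sketch expresses each as a check on a name and descriptor string. It is not HotSpot code: the class and method names are hypothetical, and descriptors are assumed to be already well-formed.

```java
class SpecialMethodChecks {
    /** A NameAndType constant on its own: no special-name checks are applied. */
    static void checkNameAndType(String name, String descriptor) {
        // Intentionally empty: <init>:()D is legal here, although no Fieldref
        // or Methodref could use it.
    }

    /** An InvokeDynamic constant may not use the name <init>, whatever the return type. */
    static void checkInvokeDynamic(String name, String descriptor) {
        if (name.equals("<init>")) {
            throw new ClassFormatError("InvokeDynamic may not name <init>: " + descriptor);
        }
    }

    /** A <clinit> method declaration must have no parameters, for every class file version. */
    static void checkMethodDeclaration(String name, String descriptor) {
        if (name.equals("<clinit>") && !descriptor.startsWith("()")) {
            throw new ClassFormatError("<clinit> must have no parameters: " + descriptor);
        }
    }
}
```

      Under this model, checkMethodDeclaration("<clinit>", "(I)V") throws ClassFormatError regardless of the class file version, which is the behavior the last bullet proposes.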

      Optional attributes

      Eleven attributes—most of which are for use by the Java programming language or debuggers—are considered "optional" and have no impact on JVM behavior, but are subject to certain restrictions during format checking. In some cases, the specification makes assertions about these attributes that implementations cannot enforce, leaving the implementations to approximate the desired behavior with ad hoc checks.

      Specifically, the contents of the following optional attributes are currently subject to some format checks:

      • Exceptions
      • InnerClasses
      • EnclosingMethod
      • Synthetic
      • Signature
      • SourceFile
      • LineNumberTable
      • LocalVariableTable
      • LocalVariableTypeTable
      • Deprecated
      • Record

      Meanwhile, JVMS requires that a number of other optional attributes be ignored during format checking. The rationale for distinguishing between the two categories is not clear, and in practice, some checks do end up being performed on these "ignored" attributes.

      For simplicity and improved performance, format checking will be changed to uniformly parse the names and lengths of all optional attributes, but otherwise completely ignore their contents. (Rules related to the existence of the attributes—e.g., that at most one Exceptions attribute is allowed per method—will continue to be enforced.)
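
      The proposed treatment amounts to reading each attribute's name and length and skipping the payload. The sketch below assumes a method's attributes table laid out as in JVMS (u2 attributes_count, then u2 attribute_name_index, u4 attribute_length, and the contents for each attribute). The utf8 array is a hypothetical stand-in for the constant pool, and OptionalAttributeScan is an illustrative name only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class OptionalAttributeScan {
    /** Builds an attributes table in which every attribute carries three arbitrary bytes. */
    static byte[] table(int... nameIndexes) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeShort(nameIndexes.length);                    // attributes_count
            for (int nameIndex : nameIndexes) {
                out.writeShort(nameIndex);                         // attribute_name_index
                out.writeInt(3);                                   // attribute_length
                out.write(new byte[] {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF}); // never inspected
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the attribute names in order; contents are skipped without being parsed. */
    static List<String> scan(byte[] table, String[] utf8) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(table));
            int count = in.readUnsignedShort();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                int nameIndex = in.readUnsignedShort();
                String name = nameIndex < utf8.length ? utf8[nameIndex] : null;
                if (name == null) {
                    throw new ClassFormatError("attribute_name_index must name a Utf8 entry");
                }
                long length = in.readInt() & 0xFFFFFFFFL;          // u4, read as unsigned
                // Existence rules are still enforced.
                if (name.equals("Exceptions") && names.contains(name)) {
                    throw new ClassFormatError("at most one Exceptions attribute per method");
                }
                if (in.skip(length) != length) {
                    throw new ClassFormatError("truncated attribute");
                }
                names.add(name);
            }
            return names;
        } catch (IOException e) {
            throw new ClassFormatError("truncated attributes table");
        }
    }
}
```

      With utf8 = {null, "Exceptions", "Signature"}, scan(table(1, 2), utf8) accepts the table even though both payloads are meaningless bytes, while scan(table(1, 1), utf8) is rejected for the duplicate Exceptions attribute.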

      Where HotSpot provides an interface for accessing these attributes (such as via JDI or core reflection), validation errors can be thrown by the API, as necessary, when the API is invoked—sometime after the class is loaded.

      Verification

      The StackMapTable attribute and the exception_table of the Code attribute must be interpreted with respect to the bytecode of the corresponding Code attribute. But because bytecode is not parsed until verification, many specified format checks on StackMapTable and exception_table are, in HotSpot, verification-time checks.

      To resolve this inconsistency, the specification of verification will be updated to formally include all validation of StackMapTable and exception_table contents, and the corresponding format checking assertions will be expressed as recommendations, not rules.

      In addition, the specification will be updated to clarify the relationship between the static and structural constraints on bytecode (JVMS 4.9) and the verification algorithms (JVMS 4.10). Various bugs in the rules for verification by type checking will be fixed, and a number of redundant assertions will be removed (such as the check, already enforced at class loading, that a class's superclass is not final).

      In HotSpot, the following behavioral changes will be made:

      • Moving a few simple checks on the exception_table contents (such as the requirement that each start_pc < end_pc) from format checking to verification time.

      • Changing the error type of verification-time StackMapTable and exception_table errors from ClassFormatError to VerifyError.

      • Treating an invalid Uninitialized_variable_info in a StackMapTable as an unrecoverable static constraint violation, preventing fallback to verification by type inference in version 50 class files. (This aligns its treatment with that of the similar Object_variable_info.)

      • Consistently performing the same verification checks on an invokespecial whether the instruction references a Methodref or an InterfaceMethodref. (Currently, there's an assumption that an interface name will only appear in an InterfaceMethodref.)
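
      The first two bullets can be pictured with a small sketch in which exception_table range checks run as part of verification and report VerifyError rather than ClassFormatError. The entry layout ({start_pc, end_pc, handler_pc, catch_type}) follows JVMS, but the class name is hypothetical and the checks are simplified: only numeric ranges are tested, not whether each pc falls on an instruction boundary.

```java
class ExceptionTableVerifier {
    /**
     * Checks each entry of an exception_table against the length of the code array.
     * Each entry is {start_pc, end_pc, handler_pc, catch_type}.
     */
    static void verify(int[][] exceptionTable, int codeLength) {
        for (int[] entry : exceptionTable) {
            int startPc = entry[0];
            int endPc = entry[1];
            int handlerPc = entry[2];
            if (startPc >= endPc) {
                throw new VerifyError("start_pc must be less than end_pc");
            }
            if (endPc > codeLength) {
                throw new VerifyError("end_pc is beyond the end of the code array");
            }
            if (handlerPc >= codeLength) {
                throw new VerifyError("handler_pc is not an index into the code array");
            }
        }
    }
}
```

      For instance, verify(new int[][] {{4, 4, 0, 0}}, 8) throws VerifyError because start_pc is not less than end_pc.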

      Risks and Assumptions

      Changing JVM validation behavior is often a risk, because it may cause legacy class files to fail with new errors, or, more subtly, new class files with old version numbers to be accepted, but then fail on older JVMs.

      In general, the HotSpot changes proposed in this JEP are narrow in scope, often in corner cases that real world code is unlikely to probe. And many of the changes only modify the type of error being thrown or the timing of an error check. That said, the most likely areas of concern are:

      • New errors caused by improper appearances of the Module, ModulePackages, ModuleMainClass, and ConstantValue attributes.

      • New errors caused by pre-51 class files that declare a useless method with name <clinit> and one or more parameters.

      • Accepting class files with malformed optional attributes, even though those class files could fail to load on an older JVM.

      Besides the risk to JVM users, there is some risk that, by relaxing the constraints on optional attributes, downstream tools will be surprised by unvalidated attribute contents in class files that can be successfully loaded.

      These risks need to be balanced against the cost of the extra complexity required to fully specify and maintain longstanding, often ad hoc HotSpot behavior.

      Dependencies

      These changes are a soft prerequisite to new JVM feature work in Valhalla, including JEP 401.

            dlsmith Dan Smith