  1. JDK
  2. JDK-8251554 JEP 401: Value Classes and Objects (Preview)
  3. JDK-8317278

JVM implementation of value classes and objects



    • Sub-task
    • Resolution: Unresolved
    • P4
    • tbd
    • None
    • hotspot


      This task summarizes the JVM changes introduced by JEP 401. All features are activated by the `--enable-preview` launcher option.

      ### Class file format

      The `value` modifier is encoded in a `class` file using the `ACC_IDENTITY` (`0x0020`) flag. 0 means "value", 1 means "identity". Interfaces must not have `ACC_IDENTITY` set.

      In older-versioned `class` files, all classes (not interfaces) are considered to have `ACC_IDENTITY` set. (Historically, `0x0020` represented `ACC_SUPER`, and all classes, but not interfaces, were encouraged to set it. The flag is no longer meaningful, but coincidentally will tend to match this implicit behavior.)

      A final instance field that is guaranteed never to mutate after entry to `Object.<init>` is marked `ACC_STRICT` (`0x0800`, used in legacy classes to indicate `strictfp` semantics for methods, but no longer meaningful). Format checking ensures that this flag is only applied along with `ACC_FINAL`, and never with `ACC_STATIC`.

      Format checking fails if a `value` class is neither `abstract` nor `final`, has a non-`final` or non-`strict` instance field, or has a `synchronized` instance method.
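
      The format-checking rules above can be sketched as simple predicates over access-flag bits. This is a minimal illustration, not HotSpot's implementation: the class and method names are hypothetical, the `ACC_IDENTITY` and `ACC_STRICT` values come from this description, and the remaining flag values are the standard JVMS ones. The sketch assumes a `class` file version with preview features enabled and ignores the older-version rule.

```java
final class ValueClassFormatCheck {
    // Standard JVMS access-flag values.
    static final int ACC_STATIC       = 0x0008;
    static final int ACC_FINAL        = 0x0010;
    static final int ACC_SYNCHRONIZED = 0x0020; // method flag; shares its bit with ACC_IDENTITY
    static final int ACC_INTERFACE    = 0x0200;
    static final int ACC_ABSTRACT     = 0x0400;
    // Values given in this description.
    static final int ACC_IDENTITY     = 0x0020; // class flag: set = identity, clear = value
    static final int ACC_STRICT       = 0x0800; // field flag

    static boolean has(int flags, int mask) {
        return (flags & mask) != 0;
    }

    /** A value class is a non-interface class with ACC_IDENTITY clear. */
    static boolean isValueClass(int classFlags) {
        return !has(classFlags, ACC_INTERFACE) && !has(classFlags, ACC_IDENTITY);
    }

    /** Interfaces must not set ACC_IDENTITY; value classes must be abstract or final. */
    static boolean classFlagsOk(int classFlags) {
        if (has(classFlags, ACC_INTERFACE)) {
            return !has(classFlags, ACC_IDENTITY);
        }
        if (!isValueClass(classFlags)) {
            return true;
        }
        return has(classFlags, ACC_ABSTRACT) || has(classFlags, ACC_FINAL);
    }

    /** ACC_STRICT requires ACC_FINAL and excludes ACC_STATIC; value-class instance fields need both. */
    static boolean fieldFlagsOk(int classFlags, int fieldFlags) {
        boolean isStatic = has(fieldFlags, ACC_STATIC);
        boolean isFinal = has(fieldFlags, ACC_FINAL);
        boolean isStrict = has(fieldFlags, ACC_STRICT);
        if (isStrict && (!isFinal || isStatic)) {
            return false;
        }
        if (isValueClass(classFlags) && !isStatic) {
            return isFinal && isStrict;
        }
        return true;
    }

    /** A value class must not declare a synchronized instance method. */
    static boolean methodFlagsOk(int classFlags, int methodFlags) {
        boolean syncInstance = has(methodFlags, ACC_SYNCHRONIZED) && !has(methodFlags, ACC_STATIC);
        return !(isValueClass(classFlags) && syncInstance);
    }
}
```

      For example, `classFlagsOk(ACC_FINAL)` accepts a final value class, while `classFlagsOk(0)` rejects a value class that is neither `abstract` nor `final`.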

      ### Class loading

      At class load time, the JVM checks that the superclass of a value class (other than `java/lang/Object`) is also a value class. If it is not, the class fails to load.
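
      The superclass rule can be illustrated with a small model. The `ClassInfo` type and method name below are hypothetical stand-ins for loaded-class metadata. The sketch returns a boolean rather than throwing, because this description does not say which error is raised.

```java
final class SuperclassCheck {
    /** Hypothetical, minimal stand-in for loaded-class metadata. */
    static final class ClassInfo {
        final String name;
        final boolean isValue;
        final ClassInfo superclass; // null only for java/lang/Object

        ClassInfo(String name, boolean isValue, ClassInfo superclass) {
            this.name = name;
            this.isValue = isValue;
            this.superclass = superclass;
        }
    }

    /** Returns false when a value class extends an identity class other than java/lang/Object. */
    static boolean superclassPermitsLoading(ClassInfo c) {
        if (!c.isValue || c.superclass == null) {
            return true; // the rule only constrains value classes
        }
        if (c.superclass.name.equals("java/lang/Object")) {
            return true; // Object is the one permitted non-value superclass
        }
        return c.superclass.isValue;
    }
}
```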

      When preview features are enabled, the following non-preview classes are considered `value` classes with `strict`, early-initialized fields (details of finding/generating appropriate class file artifacts are TBD):
      - java.lang.Number
      - java.lang.Record
      - All 8 primitive wrapper classes
      - java.util.Optional

      ### Verification

      Verification prevents a `putfield` to any strict field, unless operating on `uninitializedThis`.

      instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame, NextStackFrame, ExceptionStackFrame) :-
          CP = field(FieldClassName, FieldName, FieldDescriptor),
          % New: reject a putfield to a strict field declared by the current class.
          \+ currentClassStrictField(Environment, FieldClassName, FieldName, FieldDescriptor),
          parseFieldDescriptor(FieldDescriptor, FieldType),
          canPop(StackFrame, [FieldType], PoppedFrame),
          passesProtectedCheck(Environment, FieldClassName, FieldName, FieldDescriptor, PoppedFrame),
          currentClassLoader(Environment, CurrentLoader),
          canPop(StackFrame, [FieldType, class(FieldClassName, CurrentLoader)], NextStackFrame),
          exceptionStackFrame(StackFrame, ExceptionStackFrame).

      % New helper predicate.
      currentClassStrictField(Environment, FieldClassName, FieldName, FieldDescriptor) :-
          thisClass(Environment, CurrentClass),
          classClassName(CurrentClass, FieldClassName),
          classDeclaresStrictField(CurrentClass, FieldName, FieldDescriptor).

      The second rule for the `uninitializedThis` case is unchanged.

      Note that if no matching field is declared by the current class, but the resolved field is final, a linkage error is already specified to occur. Verification need only be concerned with fields of the current class.
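
      Read informally, the two rules reduce to one decision: a `putfield` to a strict field of the current class is acceptable only while the receiver is still `uninitializedThis`. The following sketch models just that decision. All names are hypothetical, fields are identified by a made-up `name:descriptor` string, and the remaining type checks of the verifier are not modeled.

```java
import java.util.Set;

final class StrictPutfieldCheck {
    /** Hypothetical model of the verification type of the putfield receiver. */
    enum Receiver { UNINITIALIZED_THIS, INITIALIZED_OBJECT }

    /**
     * Returns true if the putfield passes the strict-field restriction.
     * strictFieldsOfCurrentClass holds "name:descriptor" keys (an assumed encoding).
     */
    static boolean putfieldAllowed(String currentClass,
                                   Set<String> strictFieldsOfCurrentClass,
                                   String fieldClass,
                                   String fieldKey,
                                   Receiver receiver) {
        boolean currentClassStrictField =
                currentClass.equals(fieldClass) && strictFieldsOfCurrentClass.contains(fieldKey);
        if (!currentClassStrictField) {
            return true; // first rule applies; other checks are out of scope here
        }
        // Only the uninitializedThis rule can accept a write to a strict field.
        return receiver == Receiver.UNINITIALIZED_THIS;
    }
}
```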

      ### Preload attribute

      Standard `L` descriptors are used to refer to value classes, and these classes are typically loaded lazily. To enable optimizations that require earlier knowledge about value class layouts, the `Preload` attribute of a class indicates that a set of referenced `CONSTANT_Class` entries should be eagerly loaded to locate potentially-useful layout information.

      Preload_attribute {
          u2 attribute_name_index;
          u4 attribute_length;
          u2 number_of_classes;
          u2 classes[number_of_classes];
      }

      There is no specified timing for loading. TBD whether the attribute allows the constant to be *resolved* (and the result cached) rather than just encouraging the class to be *loaded*. No errors are thrown if loading fails, or the given class is not a value class.
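
      As an illustration of the layout above, the following sketch decodes the raw bytes of a `Preload` attribute into its list of constant-pool indices. The class and method names are hypothetical, and the length check is an assumption derived from the structure (2 bytes for `number_of_classes` plus 2 bytes per entry). Resolving each index to a `CONSTANT_Class` entry is left out.

```java
import java.nio.ByteBuffer;

final class PreloadAttributeReader {
    /** Decodes a Preload attribute and returns the constant-pool indices in classes[]. */
    static int[] readClasses(byte[] attributeBytes) {
        ByteBuffer buf = ByteBuffer.wrap(attributeBytes); // big-endian by default, as in class files
        int attributeNameIndex = buf.getShort() & 0xFFFF; // u2, not validated in this sketch
        long attributeLength = buf.getInt() & 0xFFFF_FFFFL; // u4
        int numberOfClasses = buf.getShort() & 0xFFFF;      // u2
        if (attributeLength != 2L + 2L * numberOfClasses) {
            throw new IllegalArgumentException("attribute_length does not match number_of_classes");
        }
        int[] classes = new int[numberOfClasses];
        for (int i = 0; i < numberOfClasses; i++) {
            classes[i] = buf.getShort() & 0xFFFF; // u2 index of a CONSTANT_Class entry
        }
        return classes;
    }
}
```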

      ### Identity-sensitive operations

      - The `if_acmpeq` and `if_acmpne` instructions implement the `==` test for value objects, comparing primitive fields by their bit patterns and reference-typed fields recursively with the same test.

      - The `monitorenter` instruction throws an `IllegalMonitorStateException` if applied to a value object.
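
      The recursive comparison performed by `if_acmpeq` can be modeled in plain Java. This is a sketch under stated assumptions: `ValueObj` is a hypothetical stand-in for a value object, primitive fields are modeled as boxed values, and any other field is treated as a reference. It is not how HotSpot implements the instruction.

```java
final class AcmpSketch {
    /** Hypothetical stand-in for a value object: a class name plus field values. */
    static final class ValueObj {
        final String className;
        final Object[] fields;

        ValueObj(String className, Object... fields) {
            this.className = className;
            this.fields = fields;
        }
    }

    /** Models the == test: identity for identity objects, field-wise for value objects. */
    static boolean acmp(Object a, Object b) {
        if (a == b) {
            return true; // same reference, or both null
        }
        if (!(a instanceof ValueObj) || !(b instanceof ValueObj)) {
            return false; // at least one side is null or an identity object
        }
        ValueObj x = (ValueObj) a;
        ValueObj y = (ValueObj) b;
        if (!x.className.equals(y.className) || x.fields.length != y.fields.length) {
            return false;
        }
        for (int i = 0; i < x.fields.length; i++) {
            if (!sameField(x.fields[i], y.fields[i])) {
                return false;
            }
        }
        return true;
    }

    /** Floating-point fields compare by raw bits; other primitives by value; references recursively. */
    static boolean sameField(Object p, Object q) {
        if (p instanceof Double && q instanceof Double) {
            return Double.doubleToRawLongBits((Double) p) == Double.doubleToRawLongBits((Double) q);
        }
        if (p instanceof Float && q instanceof Float) {
            return Float.floatToRawIntBits((Float) p) == Float.floatToRawIntBits((Float) q);
        }
        if (p instanceof Integer || p instanceof Long || p instanceof Short
                || p instanceof Byte || p instanceof Character || p instanceof Boolean) {
            return p.equals(q); // modeled primitive field
        }
        return acmp(p, q); // reference-typed field
    }
}
```

      Because floating-point fields are compared by bit pattern, two modeled objects holding `0.0` and `-0.0` are not equal under this test, while two holding the same `NaN` bit pattern are.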

      ### Object encodings

      In *this release* of HotSpot, value objects are encoded as follows. (These details are subject to change in future releases.)

      - In the interpreter and C1, value objects on the stack are encoded as regular heap objects.

      - In C2, value objects on the stack are typically scalarized when stored or passed with concrete value class types. Scalarization represents each field as a separate variable, with an additional variable for metadata (modeling `null` and caching a heap-allocated copy of the object). Methods with value-class-typed parameters support both a pointer-based entry point (for interpreter and C1 calls) and a scalarized entry point (for C2-to-C2 calls). Value objects are allocated on the heap when they need to be viewed as values of a supertype of the value class.

      - (Potentially) In fields and arrays with a concrete value class type, for 64-bit builds of HotSpot, the variable stores a 64-bit word encoding a value object's field values and a boolean `null` flag. If the variable has a polymorphic type, or the field values cannot fit in this encoding, the variable stores a regular heap object pointer. (TBD whether a 32-bit build supports 32-bit flattening or just uses heap objects exclusively.)

      - (Potentially) In strict final fields with a concrete value class type, where the class is too large to fit in an atomic encoding but within some reasonable threshold, the fields of the object are flattened individually rather than falling back to a heap object.
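
      To make the idea of a 64-bit flattened encoding concrete, the sketch below packs the fields of a hypothetical value class (one `int` and one `short`) together with a null flag into a single `long`. The bit layout is invented for illustration and is not HotSpot's actual layout. It only shows why such an object can be read and written as one atomic word.

```java
final class FlatEncodingSketch {
    // Assumed layout: bit 63 = non-null flag, bits 32..47 = short field, bits 0..31 = int field.
    static final long NON_NULL = 1L << 63;
    static final long NULL_WORD = 0L; // an all-zero word models a null reference

    /** Packs a non-null value object with fields (x, y) into one 64-bit word. */
    static long encode(int x, short y) {
        return NON_NULL | ((y & 0xFFFFL) << 32) | (x & 0xFFFF_FFFFL);
    }

    static boolean isNull(long word) {
        return (word & NON_NULL) == 0;
    }

    static int x(long word) {
        return (int) word; // low 32 bits
    }

    static short y(long word) {
        return (short) (word >>> 32); // bits 32..47
    }
}
```

      A class whose fields need more bits than are available in this word, such as one with two `long` fields, would fall back to a heap object pointer as described above.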

      At the boundaries between components, conversions between encodings may occur. This includes dynamic conversions that may occur when reading/writing to a possibly-flattened array, and when invoking a possibly-scalarized method.

      Optimizations rely on the `Preload` attribute to identify value class types at preparation time. If a value class is not named by `Preload` (for example, if the class was an identity class at compile time), fields and methods may end up using a heap object encoding instead. In the case of a method overriding mismatch, where a method and its super methods disagree about scalarization of a particular type, the overriding method may dynamically force callers to deoptimize and use the pointer-based entry point.

      To facilitate the special behavior of instructions like `if_acmpeq`, value objects in the heap are identified with a new flag in their object header.

      ### Other JVM tools and APIs

      - `javap` displays the `identity` (perhaps translated to `value`) and `strict` modifiers, and recognizes the `Preload` attribute.

      - JEP 457: Class-File API (Preview) may need updates to support the new modifiers and the Preload attribute.




            dlsmith Dan Smith