  1. JDK
  2. JDK-8251554 JEP 401: Value Classes and Objects (Preview)
  3. JDK-8317278

JVM implementation of value classes and objects

Details

    • Sub-task
    • Resolution: Unresolved
    • P4
    • tbd
    • None
    • hotspot

    Description

      This task summarizes the JVM changes introduced by JEP 401. All features are activated by the `--enable-preview` launcher option.

      See full JVMS changes by following the "JVM changes" link at: https://cr.openjdk.org/~dlsmith/jep401/latest/

      ### Class file format

      The `value` modifier is encoded in a `class` file by the absence of the `ACC_IDENTITY` (`0x0020`) flag: a class with the flag unset is a value class, and a class with the flag set is an identity class. Interfaces must not have `ACC_IDENTITY` set.

      In non-preview-versioned `class` files, all classes (not interfaces) are considered to have `ACC_IDENTITY` set. (Historically, `0x0020` represented `ACC_SUPER`, and all classes, but not interfaces, were encouraged to set it. The flag is no longer meaningful, but coincidentally will tend to match this implicit behavior.)

      A final instance field that is guaranteed never to mutate after entry to `Object.<init>` is marked `ACC_STRICT` (`0x0800`, used in legacy classes to indicate `strictfp` semantics for methods, but no longer meaningful). Format checking ensures that this flag is only applied along with `ACC_FINAL`, and never with `ACC_STATIC`.

      Format checking fails if a `value` class is neither `abstract` nor `final`, has a non-`final` or non-`strict` instance field, or has a `synchronized` instance method.
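
      The flag rules above can be modeled in a few lines of Java. This is only a sketch: the class `ValueClassFlags` and its method names are hypothetical, the inputs are raw `access_flags` values, and treating interfaces as "not value classes" is an assumption of the sketch rather than HotSpot's actual format checker.

      ```java
      // Hypothetical helper that models the format checks described above.
      final class ValueClassFlags {
          static final int ACC_STATIC       = 0x0008;
          static final int ACC_FINAL        = 0x0010;
          static final int ACC_IDENTITY     = 0x0020; // class flag (historically ACC_SUPER)
          static final int ACC_SYNCHRONIZED = 0x0020; // method flag; same bit, different context
          static final int ACC_INTERFACE    = 0x0200;
          static final int ACC_ABSTRACT     = 0x0400;
          static final int ACC_STRICT       = 0x0800; // field flag

          /** In non-preview class files every class is treated as having ACC_IDENTITY. */
          static boolean isValueClass(int classFlags, boolean previewVersioned) {
              if ((classFlags & ACC_INTERFACE) != 0) return false; // assumption: interfaces are neither
              if (!previewVersioned) return false;
              return (classFlags & ACC_IDENTITY) == 0;
          }

          /** ACC_STRICT may only appear together with ACC_FINAL and never with ACC_STATIC. */
          static boolean strictFlagWellFormed(int fieldFlags) {
              if ((fieldFlags & ACC_STRICT) == 0) return true;
              return (fieldFlags & ACC_FINAL) != 0 && (fieldFlags & ACC_STATIC) == 0;
          }

          /** Checks the three value-class conditions listed in the text. */
          static boolean valueClassWellFormed(int classFlags, int[] fieldFlags, int[] methodFlags) {
              if ((classFlags & (ACC_ABSTRACT | ACC_FINAL)) == 0) return false;
              for (int f : fieldFlags) {
                  if ((f & ACC_STATIC) != 0) continue; // only instance fields are constrained
                  if ((f & ACC_FINAL) == 0 || (f & ACC_STRICT) == 0) return false;
              }
              for (int m : methodFlags) {
                  boolean instanceMethod = (m & ACC_STATIC) == 0;
                  if (instanceMethod && (m & ACC_SYNCHRONIZED) != 0) return false;
              }
              return true;
          }
      }
      ```

      For example, a `final` class with flags `0x0010` in a preview-versioned file is a value class under this model, while the same flags in a non-preview file denote an identity class.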



      ### Class loading

      At class load time, the superclass of a value class must be either another value class or `java.lang.Object`; otherwise, the class fails to load.

      When preview features are enabled, some designated standard library classes in java.base (see JDK-8317279) are loaded from a separate location, allowing them to be compiled with a 65535 minor version number and use the appropriate features.



      ### Verification

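      Verification rejects a `putfield` that targets a strict field declared by the current class, unless the instruction operates on `uninitializedThis`. In the rule below, the additions to the existing JVMS rule are marked with `**`.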

      instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame, NextStackFrame, ExceptionStackFrame) :-
          CP = field(FieldClassName, FieldName, FieldDescriptor),
          **\+ currentClassStrictField(Environment, FieldClassName, FieldName, FieldDescriptor),**
          parseFieldDescriptor(FieldDescriptor, FieldType),
          canPop(StackFrame, [FieldType], PoppedFrame),
          passesProtectedCheck(Environment, FieldClassName, FieldName, FieldDescriptor, PoppedFrame),
          currentClassLoader(Environment, CurrentLoader),
          canPop(StackFrame, [FieldType, class(FieldClassName, CurrentLoader)], NextStackFrame),
          exceptionStackFrame(StackFrame, ExceptionStackFrame).

      **currentClassStrictField(Environment, FieldClassName, FieldName, FieldDescriptor) :-**
          **thisClass(Environment, CurrentClass),**
          **classClassName(CurrentClass, FieldClassName),**
          **classDeclaresStrictField(CurrentClass, FieldName, FieldDescriptor).**

      The second rule for the 'uninitializedThis' case is unchanged.

      Note that if no matching field is declared by the current class, but the resolved field is final, a linkage error is already specified to occur. Verification need only be concerned with fields of the current class.



      ### LoadableDescriptors attribute

      Standard 'L' descriptors are used to refer to value classes, and these classes are typically loaded lazily, if at all. To enable optimizations that require earlier knowledge about value class layouts, the 'LoadableDescriptors' attribute of a class indicates that a set of referenced field descriptors (encoded as Utf8 constants) name classes that should be eagerly loaded to locate potentially-useful layout information.

      ```
      LoadableDescriptors_attribute {
          u2 attribute_name_index;
          u4 attribute_length;
          u2 number_of_descriptors;
          u2 descriptors[number_of_descriptors];
      }
      ```

      While mentioned classes are currently intended to be value classes, this is not an enforced requirement. There may be other applications of LoadableDescriptors in the future.

      During loading of the referencing class, the classes mentioned by LoadableDescriptors may be "speculatively loaded". This means that a recursive attempt may be made to load the mentioned class, but it may fail due to circularities (e.g., two classes mutually referencing each other in 'LoadableDescriptors'), and if so the JVM ignores the error; another attempt to load the class later might succeed.

      The classes mentioned by LoadableDescriptors are also allowed to be loaded during any phase of linking; any resulting errors must be discarded.
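
      To make the attribute layout concrete, the following sketch decodes the structure shown above from a raw byte array. The class `LoadableDescriptorsReader` is hypothetical; it assumes the caller has already located the attribute and passes its bytes starting at `attribute_name_index`, and it does not resolve the returned indices against a constant pool.

      ```java
      import java.nio.ByteBuffer;

      // Hypothetical reader for the LoadableDescriptors_attribute structure.
      final class LoadableDescriptorsReader {
          /** Returns the constant-pool indices of the Utf8 descriptors named by the attribute. */
          static int[] readDescriptorIndices(byte[] attributeBytes) {
              ByteBuffer in = ByteBuffer.wrap(attributeBytes);  // class files are big-endian, the ByteBuffer default
              in.getShort();                                    // attribute_name_index (not validated here)
              long length = in.getInt() & 0xFFFFFFFFL;          // attribute_length (u4)
              int count = in.getShort() & 0xFFFF;               // number_of_descriptors (u2)
              if (length != 2L + 2L * count) {
                  throw new IllegalArgumentException("attribute_length does not match number_of_descriptors");
              }
              int[] indices = new int[count];
              for (int i = 0; i < count; i++) {
                  indices[i] = in.getShort() & 0xFFFF;          // descriptors[i] (u2)
              }
              return indices;
          }
      }
      ```

      Each returned index is expected to point at a Utf8 constant holding a field descriptor such as `LPoint;`, where `Point` is an illustrative class name.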


      ### Identity-sensitive operations

      - The `if_acmpeq` and `if_acmpne` instructions implement the `==` test for value objects: primitive fields are compared by their bit patterns, and reference-typed fields are compared recursively using the same test. (The implementation is provided by java.lang.runtime.ValueObjectMethods.)

      - The `monitorenter` instruction throws an IdentityException if applied to a value object.
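
      The field-wise `==` test can be illustrated without preview features by modeling it by hand. In this sketch the records `Point` and `Line` are hypothetical stand-ins for value classes, and `same` is an illustrative name; it is not the `ValueObjectMethods` implementation, and the static parameter types stand in for the check that both operands have the same class.

      ```java
      // Hand-written model of the value-object == test; not the JDK implementation.
      final class SubstitutabilityModel {
          record Point(double x, double y) {}        // stand-in for a value class with primitive fields
          record Line(Point start, Point end) {}     // stand-in for a value class with value-typed fields

          /** Primitive fields are compared by raw bit pattern, so NaN matches NaN and 0.0 differs from -0.0. */
          static boolean same(Point a, Point b) {
              if (a == null || b == null) return a == b;
              return Double.doubleToRawLongBits(a.x()) == Double.doubleToRawLongBits(b.x())
                  && Double.doubleToRawLongBits(a.y()) == Double.doubleToRawLongBits(b.y());
          }

          /** Fields that refer to value objects are compared recursively with the same test. */
          static boolean same(Line a, Line b) {
              if (a == null || b == null) return a == b;
              return same(a.start(), b.start()) && same(a.end(), b.end());
          }
      }
      ```

      Under this model, two separately created `Point` objects holding `Double.NaN` compare as the same, whereas the primitive comparison `Double.NaN == Double.NaN` is false.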



      ### Object encodings on stack

      In this release of HotSpot, value object references on the stack are encoded as follows. (These details are subject to change in future releases.)

      In the interpreter and C1, value object references on the stack are pointers to regular heap objects.

      In C2, value object references on the stack are typically scalarized when stored or passed with concrete value class types (according to the type inferred by C2, which may be more specific than the verification type). Scalarization represents each field as a separate variable, with an additional variable for metadata (modeling 'null' and caching a heap-allocated copy of the object). If a field has a concrete value class type, it is recursively scalarized. Value objects are always allocated on the heap when they need to be viewed as values of a supertype of the value class type.

      Methods with value-class-typed parameters support both a pointer-based entry point (for interpreter and C1 calls) and a scalarized entry point (for C2-to-C2 calls). These entry points are determined at preparation time, based on the `LoadableDescriptors` attribute. If a value class is not named by `LoadableDescriptors` (for example, if the class was an identity class at compile time), entry points may end up using a heap object encoding instead. In the case of a method overriding mismatch (a method and its super methods disagree about scalarization of a particular type), the overriding method may dynamically force callers to deoptimize and use the pointer-based entry point.

      Reads from and writes to heap storage are responsible for converting the heap-encoded data (covered below) to the appropriate scalarized form. For example, a read may require allocating a new heap object. These conversions may be dynamic, such as when reading from/writing to a possibly-flattened array.



      ### Object encodings on heap

      To facilitate the special behavior of instructions like `if_acmpeq`, heap-allocated value objects are identified as such with a new flag in their object header.

      In this release of HotSpot, fields and array components of a concrete value class type will often use flattened storage, with value object references encoded as follows (these details are subject to change in future releases):

      - In the standard case, an extra byte field (set to '1' to represent "non-null") is added to the object's field layout. If the resulting field layout can fit in 8, 16, 32, or 64 bits, those bits are used as the object's flattened encoding. The 'null' reference is encoded as a zero of the appropriate size.

      - (To do: use a different strategy for ACC_STRICT fields?)

      - For cases in which other strategies do not work, the field or array component uses the normal reference encoding, a pointer to a heap-allocated object.
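
      The size rule in the standard case can be sketched as simple arithmetic. The class `FlatLayoutModel` and the method `flattenedBits` are hypothetical, and the sketch makes a simplifying assumption that HotSpot's real layout code does not: it sums raw field sizes plus one null-marker byte and ignores alignment padding between fields.

      ```java
      import java.util.OptionalInt;

      // Simplified model of the "fits in 8, 16, 32, or 64 bits" rule; ignores field alignment.
      final class FlatLayoutModel {
          /**
           * Returns the flattened encoding size in bits, or empty if the layout
           * (fields plus one null-marker byte) does not fit in 64 bits and the
           * storage would fall back to an ordinary reference.
           */
          static OptionalInt flattenedBits(int... fieldSizesInBytes) {
              int payloadBytes = 1;                       // the extra "non-null" marker byte
              for (int size : fieldSizesInBytes) {
                  payloadBytes += size;
              }
              for (int bytes = 1; bytes <= 8; bytes *= 2) {
                  if (payloadBytes <= bytes) {
                      return OptionalInt.of(bytes * 8);   // smallest of 8, 16, 32, 64 that fits
                  }
              }
              return OptionalInt.empty();                 // too large: use a heap pointer instead
          }
      }
      ```

      Under these assumptions, a value class with a single `int` field needs 5 bytes and so takes a 64-bit encoding, while one with a single `long` field needs 9 bytes and is not flattened this way.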

      Field layouts are determined during class loading, with field types mentioned by `LoadableDescriptors` being speculatively loaded as needed to determine the layout.



      ### Null-restricted storage

      For internal testing purposes, this release supports some internal JDK APIs that enable additional heap flattening techniques. Because these behaviors are not specified by Java SE, these APIs should only be used by internal JDK code for experimental purposes and should not affect user-observable outcomes.

      Two annotations can be applied to value classes:

      - `jdk.internal.vm.annotation.ImplicitlyConstructible` marks a value class willing to permit implicit creation of a *zero instance* by HotSpot. An instance of the class may be created without invoking a constructor, with all fields set to their default values.

      - `jdk.internal.vm.annotation.LooselyConsistentValue` marks a value class willing to tolerate data corruption caused by races. In a race condition, new instances of the class may be created from arbitrary combinations of existing instances' field values, without invoking a constructor.

      A third annotation, `jdk.internal.vm.annotation.NullRestricted`, can be applied to fields with an `@ImplicitlyConstructible` value class type. This marks the field as never storing `null`. Its initial value is the zero instance, and attempted `null` writes throw a NullPointerException.
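
      The observable behavior of a `@NullRestricted` field can be modeled with an ordinary class. `NullRestrictedSlot` is a hypothetical illustration only: the real annotations are internal to the JDK, and here the caller supplies the zero instance explicitly instead of HotSpot creating it implicitly.

      ```java
      import java.util.Objects;

      // Hypothetical model of null-restricted storage semantics; not how HotSpot implements it.
      final class NullRestrictedSlot<T> {
          private T value;

          /** The slot starts out holding the zero instance instead of null. */
          NullRestrictedSlot(T zeroInstance) {
              this.value = Objects.requireNonNull(zeroInstance, "zero instance");
          }

          T get() {
              return value;
          }

          /** Attempted null writes throw NullPointerException and leave the slot unchanged. */
          void set(T newValue) {
              this.value = Objects.requireNonNull(newValue, "null write to null-restricted storage");
          }
      }
      ```

      A read before any write therefore returns the zero instance, and `set(null)` fails without modifying the stored value.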

      The method `jdk.internal.misc.VM.newNullRestrictedArray` supports the creation of arrays with components that behave like `@NullRestricted` fields.

      At class load time for an `@ImplicitlyConstructible` class, to enable the creation of zero instances, the class of any `@NullRestricted` instance field is loaded and confirmed to also be an `@ImplicitlyConstructible` value class; an error occurs if this recursively requires loading the enclosing class. Other `@NullRestricted` fields are validated later, such as during preparation.

      Null-restricted heap storage can be flattened without needing an extra byte to encode `null`. If the value class is also `@LooselyConsistentValue` (and the field is not `volatile`), reads and writes are not required to be atomic, so the flattened encoding can exceed 64 bits, up to some threshold. (Array components are always padded up to some power of 2.)

            People

              dsimms David Simms
              dlsmith Dan Smith