JDK-8317278 (sub-task of JDK-8251554, JEP 401: Value Classes and Objects (Preview))

JVM implementation of value classes and objects


    • Type: Sub-task
    • Resolution: Unresolved
    • Priority: P4
    • tbd
    • None
    • hotspot

      This task summarizes the JVM changes introduced by JEP 401. All features are activated by the `--enable-preview` launcher option.

      See full JVMS changes by following the "JVM changes" link at: https://cr.openjdk.org/~dlsmith/jep401/latest/

      The Strict Field Initialization in the JVM feature is a hard prerequisite to these changes (see JDK-8351990).

      ### Class file format

      The `value` modifier is encoded in a `class` file through the `ACC_IDENTITY` (`0x0020`) flag: a clear flag (0) means "value", and a set flag (1) means "identity". Interfaces must not have `ACC_IDENTITY` set.

      In non-preview-versioned `class` files, all classes (not interfaces) are considered to have `ACC_IDENTITY` set. (Historically, `0x0020` represented `ACC_SUPER`, and all classes, but not interfaces, were encouraged to set it. The flag is no longer meaningful, but coincidentally will tend to match this implicit behavior.)

      Format checking fails if a `value` class is neither `abstract` nor `final`, has a non-`final` or non-`strict` instance field, or has a `synchronized` instance method.
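
The class-level rules above can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not HotSpot's class file parser: the class name `ClassFlagsSketch`, both method names, and the `previewVersioned` parameter (standing in for "this `class` file has a preview version number") are hypothetical, and the field and method checks from the format-checking rule are left out.

```java
// Illustrative sketch only; names and structure are assumptions, not HotSpot code.
final class ClassFlagsSketch {
    static final int ACC_FINAL     = 0x0010;
    static final int ACC_IDENTITY  = 0x0020; // historically ACC_SUPER
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT  = 0x0400;

    /** True if the access flags describe a value class. */
    static boolean isValueClass(int accessFlags, boolean previewVersioned) {
        if ((accessFlags & ACC_INTERFACE) != 0) {
            return false; // interfaces are neither value nor identity classes
        }
        if (!previewVersioned) {
            return false; // every class is treated as if ACC_IDENTITY were set
        }
        return (accessFlags & ACC_IDENTITY) == 0;
    }

    /** Class-level part of the format check; field and method checks are omitted. */
    static boolean passesClassLevelCheck(int accessFlags, boolean previewVersioned) {
        if (!previewVersioned) {
            return true; // the new rules apply only to preview-versioned files
        }
        if ((accessFlags & ACC_INTERFACE) != 0) {
            return (accessFlags & ACC_IDENTITY) == 0; // interfaces must not set ACC_IDENTITY
        }
        if (isValueClass(accessFlags, true)) {
            // A value class must be abstract or final.
            return (accessFlags & (ACC_ABSTRACT | ACC_FINAL)) != 0;
        }
        return true;
    }
}
```

For example, a preview-versioned class with flags `0x0011` (`ACC_PUBLIC | ACC_FINAL`) is a value class under this sketch, while the same flags in a non-preview `class` file describe an identity class.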


      ### Class loading

      At class load time, a value class must have a superclass that is either another value class or java/lang/Object. If not, the class fails to load.

      When preview features are enabled, some designated standard library classes in java.base (see JDK-8317279) are loaded from a separate location, allowing them to be compiled with a 65535 minor version number and use the appropriate features.


      ### LoadableDescriptors attribute

      Standard `L` descriptors are used to refer to value classes, and these classes are typically loaded lazily, if at all. To enable optimizations that require earlier knowledge about value class layouts, the `LoadableDescriptors` attribute of a class indicates that a set of referenced field descriptors (encoded as Utf8 constants) name classes that should be eagerly loaded to locate potentially-useful layout information.

      ```
      LoadableDescriptors_attribute {
          u2 attribute_name_index;
          u4 attribute_length;
          u2 number_of_descriptors;
          u2 descriptors[number_of_descriptors];
      }
      ```

      While mentioned classes are currently intended to be value classes, this is not an enforced requirement. There may be other applications of LoadableDescriptors in the future.
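
As a rough illustration of the structure above, the following Java sketch reads one such attribute from a byte array and returns the constant pool indices of the descriptors. The class and method names are hypothetical, the constant pool is not consulted (so the indices are not resolved to Utf8 strings), and the length check simply follows from the layout: `attribute_length` covers the two-byte count plus two bytes per entry.

```java
// Illustrative sketch only; not the JVM's parser.
final class LoadableDescriptorsSketch {
    /** Returns the constant pool indices listed in a LoadableDescriptors attribute. */
    static int[] parse(byte[] attributeBytes) {
        // Class files are big-endian, which is ByteBuffer's default byte order.
        java.nio.ByteBuffer in = java.nio.ByteBuffer.wrap(attributeBytes);
        int nameIndex = in.getShort() & 0xFFFF;  // attribute_name_index (not resolved here)
        long length = in.getInt() & 0xFFFFFFFFL; // attribute_length
        int count = in.getShort() & 0xFFFF;      // number_of_descriptors
        if (length != 2L + 2L * count) {
            throw new IllegalArgumentException(
                "attribute_length " + length + " does not match " + count + " descriptors");
        }
        int[] descriptors = new int[count];
        for (int i = 0; i < count; i++) {
            descriptors[i] = in.getShort() & 0xFFFF; // index of a CONSTANT_Utf8 entry
        }
        return descriptors;
    }
}
```

A truncated input makes `ByteBuffer` throw `BufferUnderflowException`, which this sketch lets propagate.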

      During loading of the referencing class, the classes mentioned by `LoadableDescriptors` may be "speculatively loaded": the JVM may make a recursive attempt to load each mentioned class. That attempt may fail due to circularities (e.g., two classes naming each other in `LoadableDescriptors`), in which case the JVM ignores the error; a later attempt to load the class might succeed.

      The classes mentioned by LoadableDescriptors are also allowed to be loaded during any phase of linking; any resulting errors must be discarded.
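
The "attempt, then discard any error" behavior can be approximated at the Java level as shown below. This is only an analogy: the JVM performs speculative loading internally, and the class name `SpeculativeLoadSketch`, the method `trySpeculativeLoad`, and the use of `Class.forName` are assumptions made for illustration.

```java
// Illustrative analogy only; HotSpot does not load classes through this API.
final class SpeculativeLoadSketch {
    /**
     * Tries to load the class named by an 'L' field descriptor without
     * initializing it. Returns null when the attempt fails, mirroring the
     * rule that errors from speculative loading are discarded.
     */
    static Class<?> trySpeculativeLoad(String descriptor, ClassLoader loader) {
        if (descriptor.length() < 3 || descriptor.charAt(0) != 'L' || !descriptor.endsWith(";")) {
            return null; // not a class descriptor; nothing to load
        }
        String binaryName = descriptor.substring(1, descriptor.length() - 1).replace('/', '.');
        try {
            return Class.forName(binaryName, false, loader);
        } catch (ClassNotFoundException | LinkageError e) {
            return null; // ignored; a later, non-speculative load may still succeed
        }
    }
}
```

Returning `null` rather than rethrowing keeps the failure invisible to the caller, which is the property the text describes.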


      ### Identity-sensitive operations

      - The `if_acmpeq` and `if_acmpne` operations implement the `==` test for value objects, comparing primitive fields by their bit patterns and reference-typed fields recursively with the same test. (The implementation is provided by java.lang.runtime.ValueObjectMethods.)

      - The `monitorenter` instruction throws an IdentityException if applied to a value object.
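
The comparison performed for value objects can be sketched in ordinary Java. The classes `Point` and `Segment` below are hypothetical identity classes standing in for value classes (so the code compiles without preview features), and the use of `Double.doubleToRawLongBits` is an assumption about what "bit patterns" means for `double` fields.

```java
// Illustrative sketch only; the real logic lives in the JVM and
// java.lang.runtime.ValueObjectMethods.
final class AcmpSketch {
    // Hypothetical stand-ins for value classes.
    static final class Point {
        final double x;
        final double y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    static final class Segment {
        final Point from;
        final Point to;
        Segment(Point from, Point to) { this.from = from; this.to = to; }
    }

    /** Field-wise test: primitive fields are compared by bit pattern. */
    static boolean same(Point a, Point b) {
        if (a == null || b == null) return a == b;
        return Double.doubleToRawLongBits(a.x) == Double.doubleToRawLongBits(b.x)
            && Double.doubleToRawLongBits(a.y) == Double.doubleToRawLongBits(b.y);
    }

    /** Fields of a (stand-in) value class type are compared recursively. */
    static boolean same(Segment a, Segment b) {
        if (a == null || b == null) return a == b;
        return same(a.from, b.from) && same(a.to, b.to);
    }
}
```

Comparing bit patterns differs from `double` `==`: in this sketch two points holding the same `NaN` bits compare as the same, while `0.0` and `-0.0` do not.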


      ### Object encodings on stack

      In this release of HotSpot, value object references on the stack are encoded as follows. (These details are subject to change in future releases.)

      In the interpreter and C1, value object references on the stack are pointers to regular heap objects.

      In C2, value object references on the stack are typically scalarized when stored or passed with concrete value class types (according to the type inferred by C2, which may be more specific than the verification type). Scalarization represents each field as a separate variable, with an additional variable for metadata (modeling 'null' and caching a heap-allocated copy of the object). If a field has a concrete value class type, it is recursively scalarized. Value objects are always allocated on the heap when they need to be viewed as values of a supertype of the value class type.

      Methods with value-class-typed parameters support both a pointer-based entry point (for interpreter and C1 calls) and a scalarized entry point (for C2-to-C2 calls). These entry points are determined at preparation time, based on the `LoadableDescriptors` attribute. If a value class is not named by `LoadableDescriptors` (for example, if the class was an identity class at compile time), entry points may end up using a heap object encoding instead. In the case of a method overriding mismatch (a method and its super methods disagree about scalarization of a particular type), the overriding method may dynamically force callers to deoptimize and use the pointer-based entry point.

      Reads from and writes to heap storage are responsible for converting the heap-encoded data (covered below) to the appropriate scalarized form. For example, a read may require allocating a new heap object. These conversions may be dynamic, such as when reading from/writing to a possibly-flattened array.


      ### Object encodings on heap

      To facilitate the special behavior of instructions like `if_acmpeq`, heap-allocated value objects are identified as such with a new flag in their object header.

      In this release of HotSpot, fields and array components of a concrete value class type will often use flattened storage, with value object references encoded as follows (these details are subject to change in future releases):

      - Nullable atomic: In the standard case, an extra byte field (set to '1' to represent "non-null") is added to the object's field layout. If the resulting field layout can fit in 8, 16, 32, or 64 bits, those bits are used as the object's flattened encoding. The 'null' reference is encoded as a zero of the appropriate size.

      - Reference: as a fallback, the field or array component uses the normal reference encoding, a pointer to a heap-allocated object.
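
To make the nullable atomic encoding concrete, here is a minimal sketch that packs a hypothetical two-field class plus a one-byte null marker into a single 64-bit word. The class `Pair`, its field types, and the bit positions chosen are all assumptions for illustration; HotSpot's actual field ordering and padding are not described in this issue.

```java
// Illustrative sketch only; the bit layout is hypothetical.
final class NullableAtomicSketch {
    // Stand-in for a value class with 4 + 2 bytes of fields.
    static final class Pair {
        final int a;
        final short b;
        Pair(int a, short b) { this.a = a; this.b = b; }
    }

    // Assumed layout: null marker in bits 0-7, 'a' in bits 8-39, 'b' in bits 40-55.
    // Seven bytes are needed in total, so the encoding fits in 64 bits.
    static long encode(Pair p) {
        if (p == null) {
            return 0L; // null is a zero of the container size
        }
        return 1L                          // marker byte: 1 means "non-null"
            | ((p.a & 0xFFFFFFFFL) << 8)
            | ((p.b & 0xFFFFL) << 40);
    }

    static Pair decode(long bits) {
        if ((bits & 0xFF) == 0) {
            return null; // marker byte is zero
        }
        return new Pair((int) (bits >>> 8), (short) (bits >>> 40));
    }
}
```

Because the whole encoding is one 64-bit value, a single load or store moves it as a unit. The marker byte is what distinguishes a non-null `Pair` whose fields are all zero (encoded as `1L` here) from `null` (encoded as `0L`).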

      Field layouts are determined during class loading, with field types mentioned by `LoadableDescriptors` being speculatively loaded as needed to determine the layout.


      ### Null-restricted storage

      For internal testing purposes, this release supports some internal JDK APIs that enable additional heap flattening techniques. Because these behaviors are not specified by Java SE, these APIs should only be used by internal JDK code for experimental purposes and should not affect user-observable outcomes.

      The annotation `jdk.internal.vm.annotation.LooselyConsistentValue` can be applied to a value class, indicating that it is willing to tolerate data corruption caused by races. In a race condition, new instances of the class may be created from arbitrary combinations of existing instances' field values, without invoking a constructor.

      The annotation `jdk.internal.vm.annotation.NullRestricted` can be applied to a reference-typed field marked `ACC_STRICT`. This indicates that the field will never store `null`. Attempted `null` writes throw a NullPointerException.

      The method `jdk.internal.misc.VM.newNullRestrictedArray` supports the creation of arrays with components that behave like `@NullRestricted` fields. (An appropriate initial value must be provided by the caller.)

      These annotations enable two additional encodings for concrete value class types:

      - Null-restricted atomic: If the storage is null-restricted and the class layout can fit in 8, 16, 32, or 64 bits, those bits are used as the object's flattened encoding. No extra byte for tracking nulls is necessary.

      - Component-wise: If the storage is null-restricted and non-`volatile`, and if the value class is marked `@LooselyConsistentValue`, the class field layout is stored directly as a flattened encoding, with each field read and written independently. There is no hard size limit. (Array component sizes are always padded up to some power of 2.)

      When a field or array component can support multiple encodings, HotSpot prioritizes among the possibilities.
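
The eligibility conditions for the four encodings described in this issue can be summarized in a small Java sketch. Everything here is hypothetical: the enum, the method, and the simplification that a layout is just a byte count with no alignment or padding. The sketch also assumes that the nullable atomic encoding remains available for null-restricted storage whenever it fits. It returns the set of eligible encodings only; how HotSpot prioritizes among them is not specified here, so no ordering is implied.

```java
// Illustrative sketch only; not HotSpot's layout logic.
final class EncodingChoiceSketch {
    enum Encoding { NULLABLE_ATOMIC, NULL_RESTRICTED_ATOMIC, COMPONENT_WISE, REFERENCE }

    /**
     * Lists the encodings a field or array component of a concrete value
     * class type could use, given the size of the class's field layout.
     */
    static java.util.EnumSet<Encoding> eligible(int layoutBytes,
                                                boolean nullRestricted,
                                                boolean isVolatile,
                                                boolean looselyConsistent) {
        // The reference encoding is always available as a fallback.
        java.util.EnumSet<Encoding> result = java.util.EnumSet.of(Encoding.REFERENCE);
        if (layoutBytes + 1 <= 8) {
            result.add(Encoding.NULLABLE_ATOMIC); // layout plus null-marker byte fits in 64 bits
        }
        if (nullRestricted && layoutBytes <= 8) {
            result.add(Encoding.NULL_RESTRICTED_ATOMIC); // no marker byte needed
        }
        if (nullRestricted && !isVolatile && looselyConsistent) {
            result.add(Encoding.COMPONENT_WISE); // no hard size limit
        }
        return result;
    }
}
```

For instance, an 8-byte layout in null-restricted storage is eligible for the null-restricted atomic encoding but not the nullable atomic one, since the extra marker byte would push it past 64 bits.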

            dsimms David Simms
            dlsmith Dan Smith