
Type: Enhancement
Resolution: Unresolved
Priority: P3
Fix Version/s: tbd
Affects Version/s: 23
Component/s: hotspot
Labels:
- repo-valhalla

Subcomponent:
gc

## Summary

Have the GC support a small number (10-100) of *quasinull* values, as an optimization tool internal to the VM.

## Motivation

Project Valhalla and JEP 401 lay the foundation for a long series of heap layout optimizations. Some proposed layout optimizations require a new kind of encoding in managed references. Specifically, types like `java.util.Optional` are organized around a single managed reference, an encapsulation pattern we expect will be common.

When converting such a class to a value class, the VM takes on a new requirement, which is to create spec-compliant but efficient heap layouts for fields and array elements of that value class. The efficiency challenge is to reduce size and number of indirections, improving memory density and flatness. Those are core value propositions for Project Valhalla.

A field or array element (generically, "heap container") for a value must provide enough bits to encode, not only all possible concrete values of the value class, but also the special null reference value, which indicates that no value is present.

(There is an occasional exception to this requirement: If the VM is somehow told that nulls are impossible for a particular heap container, the VM is not required to provide an encoding for the null state. But that only applies to specially configured heap containers, and we will not further discuss that option in this RFE.)

The bits and/or encodings that help encode the null reference value are referred to as the "null channel". As a first cut, Valhalla will allocate an extra byte to serve as a boolean that signals the presence of null. If it so signals null, by being zero, then all other allocated bits will be ignored. This technique is workable, but has two flaws: 1. it expands the size of heap containers, diluting Valhalla's promise of density, and 2. the expansion can cause the heap container to become too large to load and store with proper atomicity.

Atomicity is not just for `volatile` value class fields: It is a blanket requirement, in order to retain proper class encapsulation of multi-field values. An extra null channel byte becomes a hazard to atomicity if it cannot be loaded and stored in the same atomic memory unit. (An atomic memory unit is at least a 64-bit word on all our platforms, and that is the largest such unit on many.)

Atomicity will not necessarily be broken by the null channel byte if the VM is running with compressed oops, since the required 4+1 bytes fit in a memory unit of 8 bytes. (There may well be a fragmentation waste of the remaining 3 bytes.) But for 8-byte managed references, atomicity probably fails. In that case, the VM might have to introduce, not only an extra byte to the layout, but a full indirection to an image of the one-word value "buffered" in a separate heap node. That is several words of overhead, just because the original memory unit overflowed because of its need for a null channel. Of course, the null channel technique in that case of overflow-to-buffer is simply the null reference, instead of a reference to the buffered value. Unpleasantly, this is the pre-Valhalla state of affairs, so the value proposition goes away for many heap containers containing objects like `Optional`.
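The size arithmetic in this paragraph can be made concrete with a small sketch. It assumes an atomic memory unit of 8 bytes and a one-byte null channel, as described above; the class and method names are purely illustrative and do not correspond to any HotSpot code.

```java
/** Illustrative size arithmetic only; all names are hypothetical, not HotSpot code. */
final class NullChannelFit {
    /** Assumed largest atomic memory unit, in bytes (one 64-bit word). */
    static final int ATOMIC_UNIT_BYTES = 8;
    /** Size of the extra null channel byte. */
    static final int NULL_CHANNEL_BYTES = 1;

    /** Bytes needed for a one-reference value plus its null channel byte. */
    static int payloadWithNullByte(int referenceBytes) {
        return referenceBytes + NULL_CHANNEL_BYTES;
    }

    /** True if the reference and the null channel byte fit in one atomic unit. */
    static boolean fitsAtomically(int referenceBytes) {
        return payloadWithNullByte(referenceBytes) <= ATOMIC_UNIT_BYTES;
    }

    /** Unused bytes left in the atomic unit, or -1 if the payload overflows it. */
    static int wastedBytes(int referenceBytes) {
        int used = payloadWithNullByte(referenceBytes);
        return used <= ATOMIC_UNIT_BYTES ? ATOMIC_UNIT_BYTES - used : -1;
    }

    public static void main(String[] args) {
        System.out.println("compressed oops (4 bytes) fit: " + fitsAtomically(4));
        System.out.println("wasted bytes with 4-byte refs: " + wastedBytes(4));
        System.out.println("full-width refs (8 bytes) fit: " + fitsAtomically(8));
    }
}
```

With 4-byte references the payload fits and leaves 3 bytes of padding; with 8-byte references the 9-byte payload overflows the unit, which is the case that forces the overflow-to-buffer fallback.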

(If the VM finds evidence that the nullability or atomicity requirements can be relaxed, this worst case loss of flattening can be avoided. But we will not further discuss these edge cases here.)

The root problem here with `Optional` (and all value classes like it) is that there are two kinds of null conditions relevant to its heap containers: *inner null* and *outer null*. An *outer null* is simply the state in which the container (logically) contains a null reference, instead of a value class instance. (This state must always be represented with all-zero-bits in the container, since the VM convention is that all-zero-bits is the default/initial value of every VM type.) An *inner null*, by contrast, is the state where the value class instance is present, but the class's managed reference field (logically) holds a null value. (For example, `Optional` uses a null field value to encode the "not present" state typical to optional, but other values will have other uses for inner null.) Clearly, there is a danger that inner null and outer null will be confused, stored using the same bit configuration. They must be distinguished for correct execution, requiring (again) some kind of null channel.

## Description

At last, we come to the "ask" of this RFE: A preferable alternative to a null channel byte is a quasinull. A *quasinull* is a managed reference value which is neither a literal null value (all zero bits) nor a valid reference to any heap object. The VM, and specifically the GC, should allow about 10-100 quasinull values which are distinct from null (and from each other) yet have the same basic characteristics (except equality) as null itself.

Since nulls are encoded as out of bounds pointers, with zero value, it seems likely that additional quasinull references can be supported by the VM by using other small integers, such as 1 or 2 or maybe 127.

Much null detection logic in the VM, including all NPE checks, can easily, and harmlessly, be adjusted to treat quasinulls as aliases of null. A comparison against zero would be replaced by an unsigned comparison against a small value (a hardcoded quasinull limit in the range 10-100). Trap-based null detection logic would be almost unchanged, since the zero address is on the same (inaccessible) page as the other quasinull addresses.
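The adjusted null test can be sketched as follows. This is a model over raw reference bits held in a `long`, not VM code; the limit value 31 and the convention that raw values 0 through the limit are null-like are assumptions made for the example.

```java
/** Sketch of the adjusted null test; names and the limit value are assumptions. */
final class QuasinullCheck {
    /** Hypothetical hardcoded limit: raw values 0..QUASINULL_LIMIT are null-like. */
    static final long QUASINULL_LIMIT = 31;

    /** Classic null test: compare the raw reference bits against zero. */
    static boolean isTrueNull(long rawBits) {
        return rawBits == 0;
    }

    /**
     * Adjusted test: a single unsigned comparison covers null and every quasinull.
     * The comparison must be unsigned so that addresses with the top bit set
     * (negative when viewed as a signed long) are not mistaken for small values.
     */
    static boolean isNullLike(long rawBits) {
        return Long.compareUnsigned(rawBits, QUASINULL_LIMIT) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(isNullLike(0L));                     // true null
        System.out.println(isNullLike(1L));                     // first quasinull
        System.out.println(isNullLike(0x0000_7f00_dead_0000L)); // ordinary address
        System.out.println(isNullLike(0xffff_8000_0000_0000L)); // high address
    }
}
```

The cost relative to a plain zero test is the immediate operand in the comparison, which is why the limit is meant to stay within the range of CPU instruction immediates.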

The quasinull limit should be both hardcoded (for performance) and configurable (for testing); this is possible as long as the configurable value cannot exceed the hardcoded value, but may be zero or one, for certain testing scenarios. (Suggested: a diagnostic option `-XX:MaxQuasiNull=N`.) The heap would never use the addresses at or below the hardcoded limit (say, 31 or 127, depending on the capacity of platform CPU instruction immediates).
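A minimal sketch of how the two limits could relate, assuming a hardcoded limit of 31. The class, its methods, and the validation behavior are hypothetical; only the option name `-XX:MaxQuasiNull=N` comes from the suggestion above.

```java
/** Sketch of the hardcoded versus configurable limit; all names are hypothetical. */
final class QuasinullLimit {
    /** Assumed compile-time limit, used directly in generated null checks. */
    static final int HARDCODED_LIMIT = 31;

    /**
     * Validates a configured value such as the N in -XX:MaxQuasiNull=N.
     * It may be anything from zero up to the hardcoded limit, never more.
     */
    static int effectiveLimit(int configured) {
        if (configured < 0 || configured > HARDCODED_LIMIT) {
            throw new IllegalArgumentException(
                "MaxQuasiNull must be in 0.." + HARDCODED_LIMIT + ", got " + configured);
        }
        return configured;
    }

    /**
     * The heap avoids every address at or below the hardcoded limit, whatever
     * the configured value is, so null checks can keep using the constant.
     */
    static boolean isUsableHeapAddress(long address) {
        return Long.compareUnsigned(address, HARDCODED_LIMIT) > 0;
    }
}
```

Because the reserved address range depends only on the hardcoded constant, lowering the configured value for a test run changes how many quasinulls are handed out without changing the null test itself.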

## Key use case: Layout flattening of nullable variables

For a one-field value class represented by a managed pointer, the layout could be improved as follows:

A. Compute N, the value-depth of the value, as one more than the value-depth of the class's field if it is a value class itself, else 1. (This is computed by the class layout algorithm.)

B. If the VM does not supply at least N quasinull sentinels (note: before this RFE it supplies 0) then do not apply this optimization; use a fallback (like overflow-to-buffer) that requires more memory footprint and/or indirections.

C. Adjust all system components (interpreter, JIT, JNI) which access (load or store) the affected heap container to treat the zero state as an outer null, and the Nth sentinel (when it appears in the value class field) as an inner null.

As a result, the bit-image of an `Optional` heap container is 0, when it contains an outer null (no value present). But the bit-image of an `Optional` heap container is 1, when it contains an inner null (value present, but is the empty `Optional` instance). Any value other than 0 or 1 must be a true object reference (to a non-empty `Optional` instance payload).
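Steps A through C and the resulting `Optional` encoding can be sketched as a small model. It assumes the Nth quasinull has the raw value N (one of the small-integer encodings suggested earlier) and represents a heap container as a `long`; `ValueClassInfo`, `State`, and the method names are invented for the example.

```java
/** Model of steps A-C for a one-field value class; all names are hypothetical. */
final class FlatOptionalSketch {
    /** Stand-in for class metadata; fieldType is null for a plain reference field. */
    static final class ValueClassInfo {
        final ValueClassInfo fieldType;
        ValueClassInfo(ValueClassInfo fieldType) { this.fieldType = fieldType; }
    }

    /** Step A: 1 for a plain reference field, else one more than the field's depth. */
    static int valueDepth(ValueClassInfo c) {
        return c.fieldType == null ? 1 : 1 + valueDepth(c.fieldType);
    }

    /** Step B: flatten only if the VM supplies at least N quasinulls. */
    static boolean canFlatten(ValueClassInfo c, int availableQuasinulls) {
        return valueDepth(c) <= availableQuasinulls;
    }

    /** Logical states of a flattened Optional container (value-depth 1). */
    enum State { OUTER_NULL, INNER_NULL, PRESENT }

    /** Assumed raw encoding of the first quasinull. */
    static final long QUASINULL_1 = 1L;

    /** Step C on the load path: classify the raw bits of an Optional container. */
    static State decodeOptional(long rawBits) {
        if (rawBits == 0L) return State.OUTER_NULL;          // no Optional instance
        if (rawBits == QUASINULL_1) return State.INNER_NULL; // the empty Optional
        return State.PRESENT;                                // reference to payload
    }

    /** Step C on the store path: raw bits for an Optional wrapping payloadRef. */
    static long encodePresentOptional(long payloadRef) {
        return payloadRef == 0L ? QUASINULL_1 : payloadRef;
    }
}
```

In this model a load never hands the raw value 1 to the rest of the system; the access logic turns it into an `Optional` instance whose field is a proper null.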

The above optimization generalizes to multi-field value classes, and is useful if managed references are compressed to 4 bytes. The normal atomic memory unit of 8 bytes allows 4 bytes for other value class fields. Just one reference field would be selected to carry the null channel, using a quasinull.
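As an illustration of the multi-field case, the sketch below packs a 4-byte compressed reference and a 4-byte `int` field into one 8-byte unit. The bit positions, the compressed quasinull value 1, and the rule that a zero reference half means outer null are all assumptions of this example, not a proposed layout.

```java
/** Model of a two-field value class packed into one 64-bit unit; names are hypothetical. */
final class PackedValueSketch {
    /** Assumed compressed encoding of the first quasinull. */
    static final int QUASINULL_1 = 1;
    /** Outer null is all-zero bits, the default state of any container. */
    static final long OUTER_NULL = 0L;

    /**
     * Packs a present value. A null reference field (compressedRef == 0) is stored
     * as the quasinull so the word can never look like an outer null.
     * Precondition: compressedRef is 0 or a real compressed reference, never 1.
     */
    static long pack(int compressedRef, int other) {
        int storedRef = (compressedRef == 0) ? QUASINULL_1 : compressedRef;
        return ((long) other << 32) | (storedRef & 0xFFFF_FFFFL);
    }

    /** The reference half alone carries the null channel. */
    static boolean isOuterNull(long word) {
        return (int) word == 0;
    }

    /** Reads the reference field, converting the quasinull back to a proper null. */
    static int refField(long word) {
        int stored = (int) word;
        return stored == QUASINULL_1 ? 0 : stored;
    }

    /** Reads the other 4-byte field from the upper half. */
    static int otherField(long word) {
        return (int) (word >>> 32);
    }
}
```

A value whose fields are both zero is stored as the raw word 1 rather than 0, so it stays distinguishable from an outer null while still being loaded and stored as a single 8-byte unit.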

This RFE does not ask for any of the above layout optimizations, but simply requests that the GC start the ball rolling by supplying an internal C++ API for (a) sensing the number of quasinulls, and (b) providing the Nth quasinull to whatever VM component requests it. Different GCs can supply different numbers of quasinulls, including zero, for GCs that (somehow) cannot be enhanced.
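The shape of the requested API can be modeled in Java for discussion purposes. The real API would be internal C++ in the VM; the interface and class below, including every name, are hypothetical, and the small-integer encoding is only one possible choice.

```java
/** Java model of the requested GC-side API; the real one would be internal C++. */
interface QuasinullSupport {
    /** (a) Number of quasinulls this collector supplies; may be zero. */
    int quasinullCount();

    /** (b) Raw bits of the Nth quasinull, for 1 <= n <= quasinullCount(). */
    long quasinull(int n);
}

/** One possible provider: the Nth quasinull is the small integer N. */
final class SmallIntegerQuasinulls implements QuasinullSupport {
    private final int count;

    SmallIntegerQuasinulls(int count) {
        if (count < 0) throw new IllegalArgumentException("negative count");
        this.count = count;
    }

    @Override
    public int quasinullCount() {
        return count;
    }

    @Override
    public long quasinull(int n) {
        if (n < 1 || n > count) {
            throw new IllegalArgumentException("no quasinull number " + n);
        }
        return n;
    }
}
```

A collector that cannot be enhanced would correspond to `new SmallIntegerQuasinulls(0)`: callers see a count of zero and fall back to other layouts.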

Quasinulls are useful even in the presence of oops compression modes, so GCs are encouraged to support them for all compression modes. The core test is the same: An unsigned comparison against N, where N is the number of supported quasinulls.

## Dependencies

Future work on the classfile parser's layout engine, and the interpreter, compilers, and JNI, could then use quasinulls to improve Valhalla flattening. A quasinull would never appear "on the stack" (in JVMS stack or locals). Instead, any quasinull would be converted to a proper null (whether inner or outer) by the heap access logic.

Perhaps even more future work, in the JDK, could use quasinulls explicitly as Java variables. At the Java level, a quasinull would appear to be a true null, throwing NPE when method-invoked or field-accessed. (This means VM implicit null checks must treat quasinulls as nulls, as noted above.) Bytecode operations would not distinguish quasinulls from null, preserving the integrity of the JVMS. (Not even `acmp` instructions would make the distinction. And `checkcast` might normalize quasinull to true null, to ease downstream value tracking in JITs. Some JIT code might require additional speculation or reasoning that only true nulls appear in some given variable.) There would be a restricted, JDK-only API for working with quasinulls, as follows:

```
class Quasinulls {
  /** Returns 0 for true null, >0 for quasinull, <0 for real object ref. */
  @IntrinsicCandidate static native int nullToRawBits(Object x);

  /** Returns number of possible quasinulls; can be zero. */
  @IntrinsicCandidate static native int maxQuasiNull();

  /** Returns the Nth quasinull; throws VMError if oob or not positive. */
  @IntrinsicCandidate static native Object quasiNull(int rawBits);

  /** Tells if the two references are the same, but distinguishing quasinulls. (Method name is a placeholder.) */
  @IntrinsicCandidate static native boolean sameReference(Object x, Object y);

  static boolean isTrueNull(Object x) { return nullToRawBits(x) == 0; }
  static boolean isQuasiNull(Object x) { return nullToRawBits(x) > 0; }
}

value class Optional<T> {
  private static final Optional<?> EMPTY_INSTANCE = new Optional<>(Quasinulls.quasiNull(1));
  @NeverNullTrustMe private T value;
}
```

The automatic technique outlined above applies to more than just expert-tweaked classes like the `Optional` shown here, so is much preferable. But in some cases explicit use of quasinulls may be desirable.

The `Quasinulls` API is shown here, not to encourage the JDK to embrace such techniques, but simply as a possible option for engineering unit tests for the GC, *before* the VM adopts quasinull-based optimizations. Such unit tests should be developed under a separate RFE, since they require significant JIT work to implement and optimize away the required null-folding logic.

relates to

JDK-8317278 JVM implementation of value classes and objects

In Progress

JDK-8297156 low-level control of field initialization

Draft
