Type: JEP
Resolution: Unresolved
Priority: P4
Fix Version/s: None
Component/s: hotspot
Labels:
None

JEP Type:
Feature
Exposure:
Open
Subcomponent:
runtime
Scope:
SE
Discussion:
valhalla dash dev at openjdk dot org
Effort:
M
Duration:
L

Summary

Support an optional new field initialization discipline in the JVM that ensures fields are explicitly initialized before they are read.

Goals

Allow fields declared in class files to opt in to a more structured initialization discipline, ensuring that the fields are always set before they are read and, if final, never modified after they are read.
Enable run-time optimizations for these fields by enforcing the initialization discipline via verification-time and run-time checks. Enhance the StackMapTable attribute as necessary to express field initialization status during construction.
Provide tools to diagnose initialization bugs related to static fields, even when those fields have not opted in to the new discipline.

Non-Goals

It is not a goal to introduce any new Java language features, such as a new modifier for fields.
It is not a goal to change javac compilation strategies in order to impose the initialization discipline in the bytecode of existing Java programs.

Motivation

The Java Platform specifies that all variables are initialized before use, ensuring that a program can never read from uninitialized memory. For the fields of a class—both instance fields and static fields—this is handled by implicitly setting the field to a default value before any code in the class is run. This value is always some form of "zero": the number 0, the boolean false, or a null reference.

Default values are a mixed blessing: they provide a straightforward safety net ensuring the program never observes uninitialized memory, but they can often be misinterpreted as legitimate data, not just a "nothing written yet" signal.

A null value, for example, may be read from a field and then passed on to other methods and constructors, only to trigger a NullPointerException somewhere far from where the field was read. Since Java 14, Helpful NullPointerExceptions have made it easier to pinpoint the source of the error within a line of code, but these error messages can't direct the developer back to an initialization bug that supplied the null in the first place.

The Java Platform also specifies that variables declared final cannot be mutated, ensuring that any two reads from a final variable will produce the same value. However, there is an exception to this rule for final fields while a class or instance is being initialized—the program may read different field values at different times as fields are set to their intended values.

Field initialization bugs in practice

The following example illustrates both of these field initialization problems: unexpected default values and inconsistent final fields. In these classes, the final field appID may be read by Log before it has been assigned its proper value. When this happens, different program components may end up working with conflicting field values.

class App {
    public static final long appID
            = Log.currentPID(); // [1], [4], [6]

    public static void main() {
        IO.println("App[" + appID + "] has started");
        ...
        Log.log("Completed 'main'");
    }
}

class Log { // [2]
    private static final String prefix
            = "App[" + App.appID + "]: "; // [3]

    public static void log(String msg) {
        IO.println(prefix + msg);
    }

    public static long currentPID() {
        return ProcessHandle.current().pid(); // [5]
    }
}

When the class App is run from the command line, the output looks like:

App[96052] has started
App[0]: Completed 'main'

The discrepancy between ID numbers arises because the invocation of Log.currentPID() by class App [1] implicitly triggers initialization of the Log class [2], and during that class's initialization, the default value of the appID field is read [3]. That 0 value is then embedded into the prefix string. Eventually, the currentPID() call will proceed [4], producing the current process's ID number [5], which is finally assigned to appID [6]; but that assignment will be too late for the prefix field.

In complex systems, these sorts of bugs can be very hard to recognize and diagnose. One subtlety is that the order of initialization matters: if the Log class gets initialized first, the bug does not occur. Another subtlety is that the circular dependency between classes App and Log is easy to create by mistake and easy to overlook later; if the utility method currentPID were declared in some other class, the circularity would not exist and everything would behave properly.
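
As a rough illustration, the scenario above can be reproduced in a single file, alongside the variant in which the utility method lives in a separate class. The wrapper class InitOrderDemo, the helper class Pids, and the FixedApp and FixedLog variants are hypothetical names introduced for this sketch only.

```java
public class InitOrderDemo {
    // Circular case, as in the example above: App's initializer calls into Log,
    // and Log's initializer reads App.appID while App is still being initialized.
    static class App {
        static final long appID = Log.currentPID();
    }

    static class Log {
        static final String prefix = "App[" + App.appID + "]: ";

        static long currentPID() {
            return ProcessHandle.current().pid();
        }
    }

    // Hypothetical variant: the utility method is declared in its own class,
    // so initializing FixedApp never triggers initialization of FixedLog.
    static class Pids {
        static long currentPID() {
            return ProcessHandle.current().pid();
        }
    }

    static class FixedApp {
        static final long appID = Pids.currentPID();
    }

    static class FixedLog {
        static final String prefix = "App[" + FixedApp.appID + "]: ";
    }

    static String circularPrefix() {
        long id = App.appID; // forces App to begin initialization before Log
        return Log.prefix;
    }

    static String fixedPrefix() {
        long id = FixedApp.appID; // same access order, but no circular dependency
        return FixedLog.prefix;
    }

    public static void main(String[] args) {
        System.out.println(circularPrefix()); // prefix built from the default value 0
        System.out.println(fixedPrefix());    // prefix built from the real process ID
    }
}
```

In the circular case the prefix embeds the default value 0; in the variant it embeds the actual process ID, because FixedApp.appID is assigned before FixedLog is ever initialized.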

Most kinds of Java variables do not suffer from these problems: a local variable must be explicitly assigned before it is read, and a final local variable may only be assigned once. Fields are unique in their reliance on default values.

A strict approach to field initialization

We propose an alternative approach to initializing fields, both final and non-final. Instead of every field being initialized to a default value when it is created, we alter the JVM to ensure that some fields, designated strictly-initialized, are explicitly initialized in bytecode before they are read. Compilers like javac are responsible for choosing which fields are designated strictly-initialized based on the language features used in source code. We call this a strict approach because it imposes additional restrictions on the code that initializes fields.

Strict initialization makes it impossible to have unexpected default values and inconsistent final fields. Every read from a strictly-initialized field observes a previously-written value, and if the field is final, every read observes the same value. Intuitively, these rules are what most developers expect from fields, but strict initialization promotes these rules from developer intuitions to integrity constraints, enforced by the JVM.

Strict field initialization improves program integrity

Strict initialization lays the foundation for some exciting new Java language features:

Value classes are new kinds of classes whose instances lack identity and can never be mutated. It is essential that the final instance fields of a value class instance always be observed with the same value.
Null-restricted fields are fields that can never store null. It is essential that these fields, both static and instance, not use null as a default value. But in many cases, there is no suitable alternative. Instead, these fields need to be explicitly initialized with a non-null value before they can be read.

As shown above, the behavior of field initialization can be delicate; the JVM must not impose new initialization behavior on existing programs in case they depend on the existing behavior. New language features, on the other hand, can define new rules and behaviors for field initialization, and then adopt the JVM's strict initialization discipline. As the language evolves and programs adopt new features, program components will be hardened against field initialization bugs.

Description

A strictly-initialized field does not have a default value. It cannot be read before it has been explicitly initialized, and if it is final, all subsequent reads produce the same value. Compilers mark fields that are subject to strict initialization with a flag in the class file, ACC_STRICT_INIT (0x0800).

The JVM enforces these invariants of strictly-initialized fields at run time:

For a static field, an exception is thrown during class initialization if a read attempts to access the field before it has been initialized, or if class initialization completes without initializing the field; or, when the field is final, if a write attempts to mutate the field after it has already been initialized and read.
For an instance field, verification fails if a read attempts to access the field before the super() constructor invocation, or if the super() constructor invocation can be reached without initializing the field; or, when the field is final, if a write attempts to mutate the field after the super() constructor invocation.

The invariants of fields marked as ACC_STRICT_INIT provide the JVM with opportunities to optimize uses of those fields at run time. For example, in JDK NN, HotSpot's JIT compiler treats strictly-initialized final fields as trusted. A trusted final field is known to never change, so once a value has been read from it, subsequent reads can reuse that same value. As a result, JIT-compiled code has fewer interactions with memory and may execute faster.

Below, we'll review the class initialization process in the JVM and discuss new rules for strictly-initialized static fields in more depth; then we'll review the instance initialization process and discuss new rules for strictly-initialized instance fields.

This is a preview VM feature, disabled by default

The ACC_STRICT_INIT flag that denotes a strictly-initialized field is only recognized in class files with a preview version number (XX.65535), and only when preview features are enabled at run time.

To enable preview features at run time, use the --enable-preview command-line argument:

java --enable-preview Main

Value Classes and Objects rely on strict field initialization: compilers mark all the fields of value classes as ACC_STRICT_INIT. To program with value classes, you must enable preview features at compile time and run time; this enables both value classes and strict field initialization in the JVM. However, strict field initialization is a standalone feature in the JVM: it does not assume that value classes exist and it can be used by compilers of non-Java languages. Regardless of the compiler, class files with fields marked as ACC_STRICT_INIT can only be loaded if preview features are enabled at run time.

Background on class initialization

Whenever a class is loaded by the JVM, it needs to be initialized. In bytecode, each class and interface can declare a special class initialization method, named <clinit>, for this purpose. The class initialization method is free to execute arbitrary code, and what constitutes an "initialized" class is up to the discretion of the class author. Usually class initialization includes setting all of the class's static fields to an appropriate initial value; it may also involve interactions with global state.

In Java code, class initialization methods are not written directly, but are an aggregation of each class's static field initializers and static initializer blocks.

Each class in a hierarchy may have its own <clinit> method, and every superclass must be initialized before executing the <clinit> method of a subclass.

Classes that have started but not finished their initialization process can be considered larval: developing, but not yet fully-formed.

An initialization state is used to track the status of each class at run time. In today's JVM (see JVMS 5.5), a class's initialization state may be any of the following:

Uninitialized: The class is loaded but has not yet attempted initialization.
Larval (within a particular thread): The class is currently being initialized.
Initialized: The class has successfully completed initialization, and can be used without restriction.
Erroneous: The class failed initialization and may not be used.

The <clinit> method executes while the class is in a larval state. The class is not yet initialized at this point, but its fields and methods can be freely accessed by code running in the current thread. If the <clinit> method completes successfully, the class transitions to the initialized state. If an exception occurs, the class transitions to the erroneous state and can never become initialized.
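
The transition to the erroneous state can be observed from ordinary Java code in today's JVM. The following is a minimal sketch; the class names ClassInitStates and Broken are hypothetical. The first access to Broken runs its class initialization method, which fails; every later access finds the class in the erroneous state.

```java
public class ClassInitStates {
    // A class whose <clinit> always fails.
    static class Broken {
        static int value = fail();

        static int fail() {
            throw new IllegalStateException("initialization failed");
        }
    }

    // Touches Broken.value and reports what happened.
    static String touch() {
        try {
            return "value=" + Broken.value;
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // First access: <clinit> runs and throws, so the class becomes erroneous.
        System.out.println(touch()); // ExceptionInInitializerError
        // Later accesses: the class is already erroneous and is never retried.
        System.out.println(touch()); // NoClassDefFoundError
    }
}
```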

The constraints on class initialization are enforced dynamically, at run time. For example, each getstatic instruction is responsible for checking the initialization state of the resolved field's class. If the class is not initialized, but is in a larval state in another thread, getstatic blocks until initialization completes.

Strict initialization of static fields

With this JEP, the larval class initialization state is enhanced to keep track of whether each static field of the class has been set, and whether each static field of the class has been read.

When executing a putstatic or getstatic instruction, if the resolved field is declared by a class in a larval state in the current thread, the state is updated to record that the field has been set (by putstatic) or read (by getstatic). This occurs even if the field is accessed from another method or class, and even if the field is referenced as a member of a subclass.

Some fields are declared with a ConstantValue attribute, and these fields are always considered set.

With this information, the JVM can enforce the invariants of strictly-initialized static fields as follows:

If a getstatic instruction attempts to read from a strictly-initialized field declared by a class in a larval state, and that field is not yet set, an exception is thrown, indicating that the field cannot yet be read.
If a putstatic instruction attempts to write to a strictly-initialized final field declared by a class in a larval state, and that field has already been read, an exception is thrown, indicating that the field can no longer be set.
Just before a class transitions to the initialized state, its larval state is checked to ensure that every strictly-initialized static field has been set; if not, an exception is thrown, indicating that the field must be explicitly set during class initialization.

(In some complex cases, such as due to exception handling, a static final field may be written multiple times during initialization. This is allowed, but only the ultimate value of the field will be readable.)

If a static field is read or written reflectively during class initialization (e.g. via the Field or VarHandle classes), the above rules are still enforced.

Once a class has transitioned to the initialized state, all its strictly-initialized fields have been set, and the initialization state no longer needs to keep track of static field state.
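
The bookkeeping described in this section can be modeled in a few lines of Java. This is only a sketch of the rules as stated above, not HotSpot's implementation: the class LarvalStaticModel and its method names are hypothetical, fields are identified by plain strings, and IllegalStateException stands in for whatever exception the JVM actually throws.

```java
import java.util.HashSet;
import java.util.Set;

public class LarvalStaticModel {
    private final Set<String> strictFields;      // fields marked ACC_STRICT_INIT
    private final Set<String> strictFinalFields; // the subset that is also final
    private final Set<String> set = new HashSet<>();  // fields written so far
    private final Set<String> read = new HashSet<>(); // fields read so far

    public LarvalStaticModel(Set<String> strictFields, Set<String> strictFinalFields) {
        this.strictFields = Set.copyOf(strictFields);
        this.strictFinalFields = Set.copyOf(strictFinalFields);
    }

    // Models a getstatic on a field of the larval class.
    public void getstatic(String field) {
        if (strictFields.contains(field) && !set.contains(field)) {
            throw new IllegalStateException(field + " cannot yet be read");
        }
        read.add(field);
    }

    // Models a putstatic on a field of the larval class.
    public void putstatic(String field) {
        if (strictFinalFields.contains(field) && read.contains(field)) {
            throw new IllegalStateException(field + " can no longer be set");
        }
        set.add(field);
    }

    // Models the check made just before the class becomes initialized.
    public void finishInitialization() {
        for (String field : strictFields) {
            if (!set.contains(field)) {
                throw new IllegalStateException(
                        field + " must be explicitly set during class initialization");
            }
        }
    }
}
```

In this model a strictly-initialized final field may be written several times as long as it has not yet been read, which matches the parenthetical note about exception handling above.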

Static field initialization diagnostics

Static fields that have not been designated strictly-initialized can also benefit from tracking their state during class initialization. As a debugging tool, HotSpot provides class initialization diagnostics via the command-line flag -XX:CheckAllStaticsStrictly=[warn|error|jfr] or -Xlog:strict+static=warning.

With these diagnostics turned on, whenever any non-strict static field is read during class initialization before it has been set or, in the case of a final field, mutated after it has been read, a diagnostic is generated.

The command-line flag specifies whether the diagnostic takes the form of a fatal error or an event logged to the console and JFR.

Background on instance initialization

Whenever a class instance is created with the new bytecode, that instance needs to be initialized. In bytecode, each class can declare multiple special instance initialization methods, named <init>, for this purpose. These methods are free to execute arbitrary code, and through a chain of <init> method invocations, every class in an inheritance hierarchy can define what constitutes an "initialized" class instance, at the discretion of each class author. Usually instance initialization includes setting all of the object's instance fields to an appropriate initial value; it may also involve interactions with static fields or global state.

In Java code, instance initialization methods are mainly expressed with constructors, and delegation between constructors is expressed with super(...) or this(...) calls. Instance initialization methods may also aggregate a class's instance field initializers and instance initializer blocks.

Each class in a hierarchy has at least one <init> method, and that method must, at some point before it completes, delegate to another <init> method of the current class or its superclass. This recursion bottoms out at Object.<init>.

Instances that have started but not finished their initialization process can, like classes, be considered larval: developing, but not yet fully-formed.

Like classes, objects have an initialization state, although this is expressed only indirectly in the JVM Specification. Today, an object's initialization state may be any of the following:

Uninitialized: The object has been created by new, but has not yet attempted initialization.
Restricted larval: The object is currently being initialized, and limited operations are available.
Unrestricted larval: The object is currently being initialized, but can be used without restriction.
Initialized: The object has successfully completed initialization.
Erroneous: The object failed initialization and may not be used.

An <init> method begins execution in the restricted larval state. Most operations, including method invocations, are not allowed on an object in the restricted larval state, and the object may not be shared with other code. However, its fields may be assigned with putfield. At some point another <init> method is invoked and the initialization process continues recursively, eventually reaching Object.<init>. At that point, the instance transitions to the unrestricted larval state and, one by one, the recursively invoked <init> methods complete their execution and return. In the unrestricted larval state, use of the object, including its fields and methods, is unrestricted. (The object may even be shared across threads.) The object is initialized once the outermost <init> method returns successfully. Alternatively, any <init> call in the stack might fail with an exception; in that case, the object transitions to the erroneous state and can never become initialized.
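
The unrestricted larval state is where default values of instance fields become observable today. In the following sketch (the class names LarvalInstanceDemo, Base, and Derived are hypothetical), the superclass constructor runs while the object is in the unrestricted larval state and calls an overridden method that reads a subclass field before the subclass constructor has assigned it.

```java
public class LarvalInstanceDemo {
    static class Base {
        final String seen;

        Base() {
            // The object is in the unrestricted larval state here:
            // virtual method calls on it are allowed.
            seen = describe();
        }

        String describe() {
            return "base";
        }
    }

    static class Derived extends Base {
        final String name;

        Derived(String name) {
            super();          // Base() runs before 'name' is assigned
            this.name = name;
        }

        @Override
        String describe() {
            return "name=" + name;
        }
    }

    public static void main(String[] args) {
        Derived d = new Derived("x");
        System.out.println(d.seen);       // name=null
        System.out.println(d.describe()); // name=x
    }
}
```

The final field name is observed with two different values, null and then "x", which is the inconsistency that strict initialization is meant to rule out for fields that opt in.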

The constraints on instance initialization are enforced statically, by the verifier. Verification determines a type state for each instruction, and that type state is either restricted (for objects in the restricted larval state) or unrestricted (for objects in the unrestricted larval and initialized states, and for static method bodies). A restricted type state is indicated with flagThisUninit.

For instructions operating on restricted type states, the verifier prevents most operations on the current object, and ensures that an unrestricted type state can only be reached via a chain of recursively delegating <init> calls that eventually reaches Object.<init>. The return instruction, which makes a newly constructed object available to the caller of <init>, is only allowed in an unrestricted type state.

Strict initialization of instance fields

With this JEP, the restricted larval instance initialization state is enhanced to keep track of whether each instance field of the class has been set.

In the verifier, this is expressed with a restricted type state that carries a list of all the current class's strictly-initialized instance fields that have not yet been set. A putfield on the current class instance in a restricted type state removes the named field from the list.

The enhanced type state supports the following rules to enforce the invariants of strictly-initialized instance fields:

An invokespecial of an <init> method, applied to the current class instance in a restricted type state, requires that if the invocation is of a superclass method, the list of unset fields must be empty. (If the invocation is of another <init> method of the same class, there is no such requirement—the invoked method is responsible for setting the fields.)
A putfield instruction writing to a strictly-initialized final field of the current class is only allowed in a restricted type state. (In contrast, putfield is allowed throughout the body of an <init> method for final fields that are not strictly-initialized.)

It has never been permitted to use getfield on an instance in a restricted type state. Thus, there is no rule for getfield analogous to the getstatic rule for static fields, and no need to track whether final fields have been read.

Jumps between restricted and unrestricted type states are not allowed. Jumps between different restricted type states are allowed, as long as the jump is to a type state in which fewer fields are set.

These verification rules ensure that all strictly-initialized fields of an object will be set while it is in a restricted larval state, before any reads can occur, and that no strictly-initialized final fields will be mutated once the object enters the unrestricted larval state. When the code executes, there is no need for additional checks to enforce the initialization invariants.
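
The enhanced type state can be sketched as an immutable set of unset field names. This is a simplified model of the rules above, not the verifier's actual data structure: the class RestrictedTypeState and its methods are hypothetical, and the jump rule is interpreted as "the target may not claim more fields set than the source", which treats equal states as compatible.

```java
import java.util.HashSet;
import java.util.Set;

public class RestrictedTypeState {
    // Strictly-initialized instance fields of the current class not yet set.
    private final Set<String> unsetFields;

    public RestrictedTypeState(Set<String> unsetFields) {
        this.unsetFields = Set.copyOf(unsetFields);
    }

    // A putfield on the current instance removes the named field from the list.
    public RestrictedTypeState putfield(String field) {
        Set<String> remaining = new HashSet<>(unsetFields);
        remaining.remove(field);
        return new RestrictedTypeState(remaining);
    }

    // invokespecial of a superclass <init>: every strict field must be set.
    public boolean allowsSuperInit() {
        return unsetFields.isEmpty();
    }

    // invokespecial of another <init> of the same class: no requirement,
    // because the invoked method is responsible for setting the fields.
    public boolean allowsThisInit() {
        return true;
    }

    // A jump is allowed only to a state in which no additional fields are set,
    // that is, the target's unset list covers everything unset at the source.
    public boolean allowsJumpTo(RestrictedTypeState target) {
        return target.unsetFields.containsAll(unsetFields);
    }
}
```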

The StackMapTable attribute expresses the expected incoming type state for a jump target. In the past, a restricted type state has been expressed simply by including the special type uninitializedThis in the list of local variables. But when a class has strictly-initialized fields, the type state may also need to indicate whether each field has been set. This is accomplished with a new kind of StackMapTable entry, restricted_frame:

```
/* NOTE: tentatively considering the renaming
   early_larval_frame --> restricted_frame */
restricted_frame {
    u1 frame_type = RESTRICTED; /* 246 */
    u2 number_of_unset_fields;
    u2 unset_fields[number_of_unset_fields];
        // array of NameAndType constants
    base_stack_map_frame base_frame;
        // any other kind of stack frame
}
```

Alternatively, if a stack frame has any other frame_type but mentions uninitializedThis, the stack frame is implicitly restricted, with unset fields inferred as whatever fields were unset in the previous frame.
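
To make the layout concrete, the leading portion of a restricted_frame could be read as follows. This is a sketch based solely on the tentative structure shown above: the class RestrictedFrameHeader and its method are hypothetical, the frame_type value 246 may change, and the nested base_frame is deliberately left unparsed.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Arrays;

public class RestrictedFrameHeader {
    // Tentative frame_type value taken from the structure above.
    static final int RESTRICTED = 246;

    // Reads frame_type, number_of_unset_fields, and the unset_fields array.
    // Returns the constant pool indices of the NameAndType entries.
    // The base_frame that follows is left in the stream for the caller.
    static int[] readUnsetFieldIndices(DataInputStream in) throws IOException {
        int frameType = in.readUnsignedByte();             // u1 frame_type
        if (frameType != RESTRICTED) {
            throw new IOException("not a restricted_frame: frame_type " + frameType);
        }
        int count = in.readUnsignedShort();                // u2 number_of_unset_fields
        int[] indices = new int[count];
        for (int i = 0; i < count; i++) {
            indices[i] = in.readUnsignedShort();           // u2 unset_fields[i]
        }
        return indices;
    }

    public static void main(String[] args) throws IOException {
        // frame_type 246, two unset fields, constant pool indices 5 and 7
        byte[] bytes = { (byte) 246, 0, 2, 0, 5, 0, 7 };
        int[] indices = readUnsetFieldIndices(
                new DataInputStream(new ByteArrayInputStream(bytes)));
        System.out.println(Arrays.toString(indices)); // [5, 7]
    }
}
```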

Strictly-initialized final fields cannot be mutated by deep reflection

Some applications and frameworks use deep reflection to manipulate an object's private or final fields after instance initialization completes. In JDK 26, the mutation of final fields by deep reflection is permitted but causes a warning; in a future release, application developers will have to explicitly enable the capability at startup. See JEP 500 for more information.

The mutation of strictly-initialized final fields by deep reflection is inconsistent with the strict initialization invariants: different reads of the same final field can observe different values. So the Field.setAccessible method categorizes these fields as non-modifiable (just as it does for static final fields and the final fields of record classes), and attempting to set a strictly-initialized final field always throws an IllegalAccessException. Using --enable-final-field-mutation=... in JDK 26 or later does not allow mutation of these non-modifiable fields.

Developers interested in reflectively setting the strictly-initialized final instance fields of a class need to delegate to one of the class's constructors, which has exclusive permission to assign to the field.
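
Record classes already behave this way today, so they offer a reasonable preview of what to expect. In the sketch below (the names NonModifiableFinalDemo, Point, trySetX, and withX are hypothetical), an attempt to set a record's final field reflectively fails even after setAccessible(true), while reflectively invoking the constructor produces a new instance with the desired value.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;

public class NonModifiableFinalDemo {
    record Point(int x, int y) {}

    // Attempts to mutate the final field 'x'; returns whether it succeeded.
    static boolean trySetX(Point p, int newX) throws ReflectiveOperationException {
        Field f = Point.class.getDeclaredField("x");
        f.setAccessible(true); // grants access, but not write access to this field
        try {
            f.setInt(p, newX);
            return true;
        } catch (IllegalAccessException e) {
            return false;      // final fields of record classes are non-modifiable
        }
    }

    // The supported alternative: delegate to a constructor.
    static Point withX(Point p, int newX) throws ReflectiveOperationException {
        Constructor<Point> c = Point.class.getDeclaredConstructor(int.class, int.class);
        return c.newInstance(newX, p.y());
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Point p = new Point(1, 2);
        System.out.println(trySetX(p, 9)); // false
        System.out.println(withX(p, 9));   // Point[x=9, y=2]
    }
}
```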

Strictly-initialized fields require custom deserialization

Object deserialization in the standard library is implemented by skipping the usual execution of an <init> method in the class being instantiated. Instead, the ObjectInputStream API provides its own "constructor" via reflective library code. Much like deep reflection, this capability bypasses the verification-based enforcement of constraints on strictly-initialized instance fields, and must not be used for classes that declare these fields.

Accordingly, ObjectOutputStream.writeObject and ObjectInputStream.readObject throw an InvalidClassException if a class being serialized or deserialized declares a strictly-initialized instance field (and the class is not a record class).

Users of serialization can implement the writeReplace and readResolve methods to avoid this exception. Doing so causes a replacement object to be serialized and deserialized instead of the object with strictly-initialized fields.
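
One common way to do this is the serialization proxy pattern, sketched below with ordinary final fields standing in for strictly-initialized ones. The names SerializationProxyDemo, Range, and RangeProxy are hypothetical. The proxy carries the data through the stream in plain fields, and readResolve rebuilds the real object through its constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationProxyDemo {
    // Stand-in for a class whose fields would be strictly-initialized.
    static final class Range implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int lo;
        private final int hi;

        Range(int lo, int hi) {
            if (lo > hi) throw new IllegalArgumentException("lo > hi");
            this.lo = lo;
            this.hi = hi;
        }

        int lo() { return lo; }
        int hi() { return hi; }

        // Serialize the proxy instead of this object.
        private Object writeReplace() {
            return new RangeProxy(lo, hi);
        }
    }

    // Proxy with plain, non-final fields that default deserialization can fill.
    private static final class RangeProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private int lo;
        private int hi;

        RangeProxy(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
        }

        // Rebuild the real object through its constructor.
        private Object readResolve() {
            return new Range(lo, hi);
        }
    }

    static Range roundTrip(Range range) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(range);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Range) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Range copy = roundTrip(new Range(1, 5));
        System.out.println(copy.lo() + ".." + copy.hi()); // 1..5
    }
}
```

Because the Range instance is always created by its own constructor, any validation in that constructor also applies to deserialized data.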

(A future enhancement to serialization is anticipated, allowing class authors to declare special constructors that ObjectInputStream.readObject can use to create new instances from the data in serialization streams, even if the class has strictly-initialized instance fields.)

Supporting changes

The Field.accessFlags and Field.getModifiers methods should reflect the presence of ACC_STRICT_INIT.
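
Assuming Field.getModifiers exposes the flag as a raw bit, a tool could test for it as sketched below. The class StrictInitFlagCheck and its methods are hypothetical, and the exact reflective representation of the flag is not specified here. Only the value 0x0800 comes from this JEP.

```java
import java.lang.reflect.Field;

public class StrictInitFlagCheck {
    // Flag value given in this JEP; the same bit was once ACC_STRICT on methods.
    static final int ACC_STRICT_INIT = 0x0800;

    // An ordinary field, used only as a sample for the reflective check.
    static String sample = "x";

    // Checks raw field access flags, for example as read from a class file.
    static boolean hasStrictInitBit(int fieldAccessFlags) {
        return (fieldAccessFlags & ACC_STRICT_INIT) != 0;
    }

    // Assumption: getModifiers() reports the bit unchanged once it is supported.
    static boolean hasStrictInitBit(Field field) {
        return hasStrictInitBit(field.getModifiers());
    }

    public static void main(String[] args) throws NoSuchFieldException {
        Field f = StrictInitFlagCheck.class.getDeclaredField("sample");
        System.out.println(hasStrictInitBit(f));               // false: not strict
        System.out.println(hasStrictInitBit(0x0800 | 0x0010)); // true: strict and final
    }
}
```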

The java.lang.classfile API should support ACC_STRICT_INIT and restricted_frame entries in StackMapTable. When a StackMapTable is automatically generated, it should properly encode the initialization state of strictly-initialized fields.

The javap tool should properly display the ACC_STRICT_INIT modifier and restricted_frames; it should also do a better job of presenting the implicit initialization states in a StackMapTable.

The asmtools tools should similarly be updated to support ACC_STRICT_INIT and restricted_frame.

Alternatives

In JDK 21, the javac compiler added warnings to discourage invocations of instance methods from superclass constructors (see JDK-8299995). Such warnings are helpful, but of course are no substitute for invariants enforced by the JVM.
We've considered approaches that enforce instance field invariants with dynamic checks. These would allow more flexibility in the timing of instance field initialization. Unfortunately, they require a run-time overhead that is not easily optimized away once the object has been fully initialized.

Risks and Assumptions

Before Java SE 17, the ACC_STRICT flag, also 0x0800, was applied to methods to indicate a requirement for "strict" floating-point semantics. That capability was removed by JEP 306. The two flags are unrelated, beyond their similar names.

blocks

JDK-8367935 [lworld] Rename ACC_STRICT in the JVM according to strict fields JEP

Resolved

1.

JVM implementation of strict field initialization

Open

Matias Saavedra Silva
