JDK-8209964

Lazy Static Final Fields

    Details

    • Type: JEP
    • Status: Draft
    • Priority: P3
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: tools
    • Labels:
      None
    • JEP Type:
      Feature
    • Exposure:
      Open
    • Scope:
      JDK

      Description

      Summary

      Expand the behavior of final variables to include optional lazy evaluation patterns, in both the language and the JVM. In doing so, refine Java's existing lazy evaluation mechanisms from their current per-class granularity to per-variable granularity.

      Motivation

      Java uses lazy evaluation pervasively. Almost every linkage operation potentially triggers a lazy evaluation, such as the execution of a <clinit> method (class initializer bytecode) or invocation of a bootstrap method (for an invokedynamic call site or CONSTANT_Dynamic constant).

      Class initializers are coarse-grained compared to mechanisms using bootstrap methods, because their contract is to run all initialization code for a whole class, rather than some initialization that may pertain to a particular field of that class. Such coarse-grained initialization effects make it especially difficult to predict and isolate the side effects of using one static field from the class, since computing the value of one field entails computation of all static fields in the same class.

      So touching one field touches them all. In AOT compilers, this makes it difficult to optimize a static field reference, even if the field has a clearly analyzable constant value. It only takes one extra-complicated static field in a class to make all fields non-optimizable. A similar problem appears with proposed mechanisms for constant-folding (at javac time) constant fields with complex initializers.
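
      The coarse-grained behavior described above can be observed in today's Java. The following is a minimal sketch, not part of the proposal; the class and field names (CoarseInitDemo, Config, CHEAP, COSTLY) are made up for illustration. Reading only one static field still runs the initializer of the other, because both live in the same <clinit>.

```java
import java.util.ArrayList;
import java.util.List;

class CoarseInitDemo {
    // Records the order in which field initializers actually run.
    static final List<String> TRACE = new ArrayList<>();

    static class Config {
        // Neither field is a compile-time constant, so both are assigned
        // by Config's class initializer (<clinit>).
        static final String CHEAP = compute("CHEAP");
        static final String COSTLY = compute("COSTLY");

        private static String compute(String name) {
            TRACE.add("init " + name);
            return name.toLowerCase();
        }
    }

    // Reads only CHEAP, then reports which initializers ran.
    static List<String> touchCheapOnly() {
        String ignored = Config.CHEAP;
        return TRACE;
    }

    public static void main(String[] args) {
        // prints [init CHEAP, init COSTLY]
        System.out.println(touchCheapOnly());
    }
}
```

      Even though COSTLY is never read, its initializer runs as a side effect of the first access to CHEAP.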

      As an example of an extra-complicated static field initialization, which in some codebases appears in almost every file, consider logger initialization:

      private final static Logger LOGGER = Logger.getLogger("com.foo.Bar");

      This harmless-looking initialization triggers a tremendous amount of behind-the-scenes activity at class initialization time – though it is unlikely that the logger is needed at class initialization time, or even at all. Deferring the creation to first use would streamline initialization, and might result in optimizing away the initialization entirely.

      Final variables are very useful; they are the main mechanism for Java APIs to denote constant values. Lazy variables are also well-proven. Since Java 7 they have been an increasingly important part of JDK internals, expressed via the internal @Stable annotation. The JIT can optimize both final and "stable" variables more fully than other variables. Adding lazy finals will make these useful design patterns usable in more places. Finally, their adoption will allow libraries such as the JDK to reduce their reliance on <clinit> code, with likely improvement to startup and AOT optimizations.

      Description

      A field may be declared with a new modifier lazy, a contextual keyword recognized only as a modifier. Such a field is called a lazy field, and must also be static and final.

      (In other accounts of this idea, lazy statics are marked using other variations of modifier syntax, such as __LazyStatic or lazy-static. Details are TBD.)

      A lazy static field must be supplied with an initializer. The compiler and runtime arrange to execute the initializer on the first use of the variable, not when the containing scope (the class) is initialized.

      (The initialization of a lazy static field, being handled by this special mechanism, is therefore not also present in any <clinit> method generated by the compiler.)

      Each lazy static final field is associated at compile time with a constant pool entry which supplies its value. Since constant pool entries are themselves lazily computed, this is sufficient to assign a well-defined value to any lazy static final variable associated with the constant pool entry. The association is recorded in a classfile attribute named LazyValue, which must refer to a constant pool entry that can be ldc-ed to a value that can be converted to the type of the lazy field. The allowed conversions are the same as those used by MethodHandle.invoke.
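
      As a rough illustration of the conversions in question, the sketch below uses the existing MethodHandle.asType method, which applies the same conversion rules that MethodHandle.invoke relies on. The method handle INT_CONSTANT merely stands in for a resolved constant pool entry; the class and method names are invented for this example, and nothing here uses the proposed LazyValue attribute.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

class ConversionDemo {
    // Stand-in for a resolved constant pool entry whose value is the int 42.
    static final MethodHandle INT_CONSTANT = MethodHandles.constant(int.class, 42);

    // int -> long: a primitive widening conversion is accepted.
    static long widenToLong() {
        MethodHandle mh = INT_CONSTANT.asType(MethodType.methodType(long.class));
        try {
            return (long) mh.invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // int -> Object: boxing followed by reference widening is accepted.
    static Object boxToObject() {
        MethodHandle mh = INT_CONSTANT.asType(MethodType.methodType(Object.class));
        try {
            return (Object) mh.invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // long -> int: narrowing is not an allowed conversion, so asType rejects it.
    static boolean narrowingRejected() {
        MethodHandle longConstant = MethodHandles.constant(long.class, 42L);
        try {
            longConstant.asType(MethodType.methodType(int.class));
            return false;
        } catch (WrongMethodTypeException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(widenToLong());
        System.out.println(boxToObject());
        System.out.println(narrowingRejected());
    }
}
```

      Under this reading, a lazy field of type long could be backed by an int-valued constant, while a lazy field of type int could not be backed by a long-valued one.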

      (In principle more than one lazy variable could be associated with a single constant pool entry, although this is not envisioned as a useful feature. Such grouping patterns can be built by hand on top of the basic non-grouped language feature.)

      Thus, a lazy static field may be viewed as a named alias of a constant pool entry within the class that defined the field. Tools such as compilers may exploit this property.

      A lazy static field is never a constant variable (in the sense of JLS 4.12.4) and is explicitly excluded from contributing to a constant expression (in the sense of JLS 15.28). Thus, it never possesses a ConstantValue attribute, even if its initializer is a constant expression. Instead, a lazy field possesses a new kind of classfile attribute called LazyValue, which the JVM consults when linking a reference to that particular field. The format of this new attribute is similar to the old one, because it also points to a constant pool entry, in this case the one which resolves the field value.

      When linking a lazy static field, the normal process of executing class initializers is not bypassed. Instead, any <clinit> method on the declaring class is executed according to the rules of JVMS 5.5. In other words, a getstatic bytecode of a lazy static field performs any linkage actions associated with any static field, except the lazy ones.

      (When a class is initialized due to linkage of a normal static field, the class's initialization at that time does not involve lazy static fields in any way; it is as if they are not declared, precisely because they are not initialized by the actions of any <clinit> block.)

      After initialization (or during an already-started initialization in the current thread), the JVM then resolves the constant pool entry associated with the field, and stores the value of that constant pool entry into that field.

      Since lazy static final fields cannot be blank finals, they cannot be assigned to, even in those limited contexts where blank finals may be assigned to.

      There is a rule in Java which requires that a static variable may only appear in the initializers of static variables which occur later on in the class body. This rule reduces (but does not eliminate) the possibility that an untimely read of a static variable may obtain the default value of that variable, rather than its initial value.

      class C {
        static int x = y; //error: illegal forward reference
        static int y = 42;
      }

      These ordering constraints are observed even for lazy static fields, as if they were not declared lazy. Thus, a lazy static field's initializer can only refer to a static field of the same class that occurs earlier in the same source file.

      If in some case two lazy values must depend on each other in a circular relationship, the cycle can be hidden by the use of a private static method. In that case, a true cyclic dependency will cause a stack overflow error. In the case of non-lazy statics, an analogous cycle would cause a default value to become visible.

      class C {
        //lazy static final Object x = y, y = x; //error
        lazy static final Object x = ycycle(), y = x;
        private static Object ycycle() { return y; }
      }
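
      The contrast between the two kinds of cycle can be reproduced in today's Java. This is only a sketch: since the lazy modifier does not exist yet, the lazy side is modeled by plain methods that compute on demand (the hypothetical class OnDemand), which captures the recursion but not the caching of a real lazy static.

```java
class CycleDemo {
    // Non-lazy statics: the forward read is hidden behind a method, so javac
    // accepts it, and x is computed while y still holds its default value.
    static class Eager {
        static final Object x = peekY();
        static final Object y = new Object();

        private static Object peekY() {
            return y;
        }
    }

    // Model of two mutually dependent lazy statics: each "field" is a method
    // that asks for the other one before it can produce its own value.
    static class OnDemand {
        static Object x() {
            return y();
        }

        static Object y() {
            return x();
        }
    }

    // True if the eager cycle exposed the default value (null) of y through x.
    static boolean eagerSeesDefault() {
        return Eager.x == null && Eager.y != null;
    }

    // True if the on-demand cycle ends in a stack overflow.
    static boolean onDemandOverflows() {
        try {
            OnDemand.x();
            return false;
        } catch (StackOverflowError expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("eager cycle saw default value: " + eagerSeesDefault());
        System.out.println("on-demand cycle overflowed: " + onDemandOverflows());
    }
}
```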

      Any non-lazy static field initializer or class initializer block may also refer to a lazy static field value that precedes it in the source file. This is usually not desirable, as it would tend to cancel the benefit of the lazy field, but may be useful in combination with conditional expressions or control flow.

      The purpose of the ordering rule is to require the user to specify a nominal initialization order for lazy statics. The actual dynamic initialization order may differ, but the nominal order serves to demonstrate statically that there are no unintentional cyclic dependencies between the statics, lazy and otherwise.

      Lazy fields may be recognized by the core reflection API by use of two new API points on java.lang.reflect.Field. The new query method isLazy returns true if and only if the field was declared lazy. The new query method isAssigned returns false if and only if the field is lazy and has not been initialized, at the moment the method is called. (It may return true on the very next call in the same thread, depending on race conditions.) Other than isAssigned, there is no way to observe whether a lazy field has been initialized yet.

      (The isAssigned reflective call is provided only to assist with occasional problems with circular initialization dependencies. Perhaps we can get away without implementing it, although people who code with lazy variables occasionally want to ask gently whether a lazy variable is set yet, in the same way that users of mutexes occasionally want to ask whether a mutex is locked, but without actually seizing the lock.)

      To preserve implementation freedom, the contract of isAssigned is minimized. If a JVM can prove that a lazy static variable can be initialized without observable side effects, it may do so at any time; in such a case the isAssigned query will report true even before any getstatic is executed. The minimized contract for isAssigned is that if it returns false, none of the side effects from initializing that variable have yet been observed by the current thread, whereas if it returns true, then the current thread can, in the future, observe all side effects of initialization. This contract allows compilers to substitute ldc for getstatic of their own fields, and allows JVMs to avoid tracking detailed initialization states of finals with shared or degenerate constant pool entries.

      Multiple threads may race to initialize a lazy final. As is already the case with CONSTANT_Dynamic constant pool entries, the JVM picks an arbitrary winner of such a race and provides the value from that winner to all racing threads, as well as recording it for all future accesses. Thus, JVM implementations may elect to use CAS operations, if the platform supports those, to resolve races.
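
      The race-and-pick-a-winner policy can be approximated at the library level. The sketch below is an assumption-laden model, not the JVM's implementation: RacyLazy is a hypothetical class, it uses AtomicReference.compareAndSet as the CAS, and for simplicity it does not support null values.

```java
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class RaceDemo {
    // Hypothetical one-shot cell: several threads may compute a candidate,
    // but only the first successful CAS publishes a value.
    static final class RacyLazy<T> {
        private final AtomicReference<T> slot = new AtomicReference<>();
        private final Supplier<T> initializer;

        RacyLazy(Supplier<T> initializer) {
            this.initializer = initializer;
        }

        T get() {
            T seen = slot.get();
            if (seen != null) {
                return seen;
            }
            T candidate = Objects.requireNonNull(initializer.get());
            // Losers of the race discard their candidate and adopt the winner's.
            slot.compareAndSet(null, candidate);
            return slot.get();
        }
    }

    // Starts the given number of threads at once and collects every distinct
    // object identity that any of them observed.
    static Set<Object> raceOnce(int threads) {
        RacyLazy<Object> lazy = new RacyLazy<>(Object::new);
        Set<Object> observed = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                observed.add(lazy.get());
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("distinct values observed: " + raceOnce(8).size());
    }
}
```

      However many threads compute a candidate, every caller ends up with the single published object.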

      When the JVM stores a value into a lazy final field, it performs a freeze operation. This freeze happens before any getstatic instruction is allowed to see the field value. This is how pre-existing rules for safe publication apply to lazy finals.

      The effect of a lazy final is closely similar to the effect of a static final defined on its own class, with no other static finals.

      class C { lazy static final Object x = xval(), y = yval(); }
      f() { ... getstatic C.x ... }
      =>
      class C_x { static final Object x = xval(); }
      class C_y { static final Object y = yval(); }
      f() { ... getstatic C_x.x ... }

      The difference is that a true cyclic dependency between lazy statics will cause a stack overflow, rather than the observation of a default value.
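
      The holder-class half of this equivalence is ordinary Java and can be run today. In the sketch below, the nested classes XHolder and YHolder play the roles of C_x and C_y; these names and the TRACE list are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

class HolderDemo {
    // Records which initializers have run.
    static final List<String> TRACE = new ArrayList<>();

    // Plays the role of C_x: initialized only when x is first read.
    static class XHolder {
        static final Object x = init("x");
    }

    // Plays the role of C_y: initialized only when y is first read.
    static class YHolder {
        static final Object y = init("y");
    }

    static Object init(String name) {
        TRACE.add("init " + name);
        return new Object();
    }

    // Reads x twice and never reads y, then reports which initializers ran.
    static List<String> touchXOnly() {
        Object first = XHolder.x;
        Object second = XHolder.x;
        if (first != second) {
            throw new AssertionError("x was initialized more than once");
        }
        return TRACE;
    }

    public static void main(String[] args) {
        // prints [init x]
        System.out.println(touchXOnly());
    }
}
```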

      Note that a class can convert a static to a lazy static without breaking binary compatibility. A client's getstatic instruction is identical in both cases. When the variable's declaration changes to lazy, then the getstatic instruction links differently.

      Operational Description

      A class with no lazy static fields is initialized in one pass. A class with N lazy static fields is initialized in 1+N passes. The first pass always initializes all of the normal static fields (and not the lazy static fields), in the order they occur in the source code, as in all prior versions of the Java Language Specification. All of the passes are initiated by threads which are attempting to access the class, but the passes may also be run in distinct threads.

      The first pass is run to completion in a thread chosen from among those threads which are first to perform an initializing access (of any kind) to the class. Racing threads are paused until the chosen thread completes the initialization.

      If the initializing thread reads a normal static before that same thread has initialized it, the default initial value, such as null, will be seen as the value of that field.

      If the initialization of any normal static variable fails with an exception, then the initialization of the class as a whole fails.

      Each other pass (for a lazy static) is run to completion in a thread which is chosen from among those threads which are first to perform an initializing access to that lazy static. The pass evaluates the initializer of the lazy static and produces either a correctly typed initial value for that lazy static, or throws an exception.

      If an exception is thrown, then all uses of the lazy static, including the first and those of any threads racing to initialize, will throw that same exception. Otherwise, all uses of that lazy static, including racing threads and future uses, will return that same value. Unlike normal statics, initialization failure does not cause the class as a whole to fail to initialize.
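
      The "same exception on every use" rule can be modeled with a small library class. This is a simplified sketch under stated assumptions: LazyCell is a hypothetical class, it serializes initialization with a lock instead of letting threads race, and it records only RuntimeException failures.

```java
import java.util.function.Supplier;

class FailureDemo {
    // Hypothetical cell that remembers either the value or the failure
    // produced by the single run of its initializer.
    static final class LazyCell<T> {
        private final Supplier<T> initializer;
        private boolean done;
        private T value;
        private RuntimeException failure;

        LazyCell(Supplier<T> initializer) {
            this.initializer = initializer;
        }

        synchronized T get() {
            if (!done) {
                try {
                    value = initializer.get();
                } catch (RuntimeException e) {
                    failure = e;
                }
                done = true;
            }
            if (failure != null) {
                throw failure; // every use rethrows the recorded exception
            }
            return value;
        }
    }

    // Counts how many times the failing initializer actually ran.
    static int attempts = 0;

    static final LazyCell<String> BROKEN = new LazyCell<>(() -> {
        attempts++;
        throw new IllegalStateException("boom");
    });

    // A sibling cell in the same class, unaffected by BROKEN's failure.
    static final LazyCell<String> FINE = new LazyCell<>(() -> "ok");

    // Returns the exception thrown by BROKEN, or null if none was thrown.
    static RuntimeException useBroken() {
        try {
            BROKEN.get();
            return null;
        } catch (RuntimeException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        RuntimeException first = useBroken();
        RuntimeException second = useBroken();
        System.out.println("same exception: " + (first == second));
        System.out.println("initializer runs: " + attempts);
        System.out.println("sibling value: " + FINE.get());
    }
}
```

      The failing cell does not poison its sibling, which mirrors the statement that a failed lazy static does not make the class as a whole fail to initialize.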

      Unless the initializer returns the default value of that field's type, no use of that lazy static will see the default value of that field's type. This behavior is unlike that of normal statics.

      Threads other than that chosen to initialize a lazy static may race to complete its initialization, because they attempted to read the lazy static before it was initialized. During the course of its race, such a thread may return any value or throw any exception, which will be ignored. The result returned to that thread, as to any other thread, will be the value or exception produced by the chosen thread.

      If an initializing thread reads, directly or indirectly, the value of the lazy static before the initialization value has been chosen, a stack overflow error will be thrown instead of completing the initialization. If that thread is in fact the one chosen to initialize the lazy static, then that error will be the result of reading the lazy static in all threads at all times.

      (The initialization rules for lazy statics are necessarily distinct from those for normal statics, and are fully coherent with the rules for CONSTANT_Dynamic constants in constant pools. They may be considered a surfacing of constant pool semantics into the language.)

      If the first field in a class that is linked is a lazy static field, then the first pass initializes all the normal static fields. Next, the lazy static field is initialized. If several threads are attempting to initialize the same lazy static field, one of them is chosen to initialize all of the normal static fields while all others are paused. After that pass, in a second pass, one or more threads are allowed to compute the initializer, and one result (either a value or an exception) is chosen for the lazy static.

      (It is typical but not required that the same thread will perform both passes. The initialization of normal statics can be ordered freely relative to the initialization of any lazy static, as long as all normal statics are fully initialized before the initialization of any lazy static is commenced.)

      (If some future system allows statics to be initialized before the main method is entered, then these ordering rules will still apply, which means, for some classes, all normal statics and some lazy statics may be initialized before the invocation of the main method.)

      Subsequently, linking any of the other N-1 lazy static fields will initialize just that lazy static field, again by choosing a requesting thread to produce either a value or an exception.

      If all the lazy static fields in the class are eventually linked, then the number of passes to fully initialize the class is 1+N.
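
      The 1+N ordering can be imitated in today's Java for the single-threaded case. In this sketch, each would-be lazy static lives in a private holder class and is reached through an accessor method on C, so that the first access also triggers C's own initialization, as the first pass would. All names are illustrative, and the accessor methods are an artifact of the emulation rather than part of the proposal.

```java
import java.util.ArrayList;
import java.util.List;

class PassesDemo {
    // Records the order in which initializers run.
    static final List<String> TRACE = new ArrayList<>();

    static class C {
        // Normal statics: all assigned in the first pass, in source order.
        static final Object a = init("normal a");
        static final Object b = init("normal b");

        // Stand-ins for two lazy statics, one holder class each.
        private static class XHolder {
            static final Object x = init("lazy x");
        }

        private static class YHolder {
            static final Object y = init("lazy y");
        }

        // Calling a static method of C initializes C first (pass 1),
        // then the holder for the requested field (one further pass).
        static Object x() {
            return XHolder.x;
        }

        static Object y() {
            return YHolder.y;
        }

        private static Object init(String name) {
            TRACE.add(name);
            return new Object();
        }
    }

    // Touches x first and y second, then reports the initialization order.
    static List<String> run() {
        C.x();
        C.y();
        return List.copyOf(TRACE);
    }

    public static void main(String[] args) {
        // prints [normal a, normal b, lazy x, lazy y]
        System.out.println(run());
    }
}
```

      With N = 2 stand-in lazy fields, the trace shows three passes: one for the normal statics and one for each lazy field, in the order the lazy fields were first used.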

      (Either way, all the normal static fields, and no lazy static fields, are initialized in the first pass. This "eager" initialization of normal static fields when the first static field is linked is consistent with prior releases of Java.)

      Alternatives

      Use nested classes as holders for single lazy variables.

      Define some sort of library API for managing lazy values or (more generally) monotonic data.

      Refactor would-be lazy static variables as nullary static methods and populate their bodies with ldc of CONSTANT_Dynamic constants, by some means.

      Use non-final variables for publication of lazily evaluated data, being careful not to modify them, and to fence their initialization for safe publication.

      (N.B. The above workarounds do not provide a binary-compatible way to evolve existing static constants away from their current reliance on <clinit>.)

      In the direction of adding more functionality, we could allow lazy fields to be non-static and/or non-final, preserving current correspondences and analogies between static and non-static field behaviors. The constant pool cannot be a backing store for non-static fields, but it can still contribute bootstrap methods (that depend on the current instance). Frozen arrays (if implemented) could be given lazy variations, perhaps. Such investigations seem plausible as follow-on projects for the current proposal.

              People

              Assignee:
              Unassigned
              Reporter:
              John Rose (jrose)
              Owner:
              John Rose