JDK-8300786

Statements before super()


Details

    • JEP
    • Status: Submitted
    • P4
    • Resolution: Unresolved
    • None
    • specification
    • None
    • Feature
    • Open
    • JDK
    • amber dash dev at openjdk dot java dot net

    Description

      Summary

      Allow statements that do not reference the instance being created to appear before this() or super() in a constructor.

      Goals

      • Update the Java language to allow statements that do not reference the instance being created to appear in constructors prior to this() or super() calls.
      • Correct an error in the specification which defines constructor invocations as a static context.
      • Preserve existing safety and initialization guarantees for constructors.

      Non-Goals

      Modifications to the JVMS. These changes may prompt reconsideration of the JVMS's current restrictions on constructors. However, to avoid unnecessary linkage between JLS and JVMS changes, any such modifications should be proposed in a follow-on JEP. This JEP assumes no change to the current JVMS.

      Maximizing JLS and JVMS Alignment. Although these changes will bring the JLS and JVMS into closer alignment, it is not a goal to harmonize them. The JLS and JVMS address different problem domains, and therefore it is reasonable for them to differ in what they allow. For example, the JVMS allows a constructor to write to the same final field multiple times, whereas the JLS does not.

      Changes to Current Behavior. There is no intention to change the behavior of any program following current JLS. This change strictly expands the universe of valid programs, without affecting existing ones.

      Addressing Larger Language Concerns. Thinking about the interplay between superclass constructors and subclass initialization has evolved since the Java language was first created. This work should be considered a pragmatic tweak rather than a statement on language design.

      Motivation

      As in most object-oriented languages, Java defines an explicit "construction" step that occurs after memory allocation but before "regular" use of an object. This acknowledges the fact that, in general, some initialization and setup of object state is required before objects can be safely used. In order to ensure orderly object initialization, Java specifies a variety of rules specifically related to object construction. For example, dataflow analysis verifies that all final fields have definite values assigned during construction.

      However, for classes in a non-trivial class hierarchy, object initialization does not occur in a single step. An object's state consists of the composition of groups of fields: the group of fields defined in the class itself, plus the groups of fields defined in each ancestor superclass. Initialization of each group of fields is performed as a separate step by a corresponding constructor in those fields' defining class. An object is not fully initialized until every class in the hierarchy has had its opportunity to initialize its own fields.

      To keep this process orderly, Java requires that superclass constructors execute prior to subclass constructors. The result is that Java objects are always initialized "top down". This ensures that at each level, a constructor may assume that the fields in all of its superclasses have already been initialized. This guarantee is important, as constructors often need to rely on some functionality in the superclass, and the superclass wouldn't be able to guarantee correct behavior without the assumption that its own initialization were complete. For example, it's common for a constructor to invoke superclass methods to configure or prepare the object for some specific task.
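      As a minimal sketch of this guarantee under the current rules, the hypothetical classes below (Base, Derived, and TopDownDemo are illustrative names, not part of any proposal) show a subclass constructor relying on superclass state that super() has already initialized:

```java
// Hypothetical classes for illustration only.
class Base {
    protected final String label;

    Base(String label) {
        this.label = label;              // superclass state is set first
    }

    String describe() {
        return "Base[" + label + "]";
    }
}

class Derived extends Base {
    private final String description;

    Derived(String label) {
        super(label);                    // runs before any other Derived code
        // Safe: Base is fully initialized here, so describe() sees 'label'.
        this.description = describe().toUpperCase();
    }

    String description() {
        return description;
    }
}

class TopDownDemo {
    public static void main(String[] args) {
        System.out.println(new Derived("x").description()); // prints BASE[X]
    }
}
```

      If Derived could call describe() before super(label) had run, the method would observe label as null; that is the kind of access top down initialization rules out.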

      In order to enforce this top down initialization, the Java language requires that invocations of this() or super() always appear as the first statement in a constructor. This indeed guarantees top down initialization, but it does so in a heavy-handed way, by taking what is really a semantic requirement ("Initialize the superclass before accessing the new instance") and enforcing it with a syntactic requirement ("super() or this() must literally be the first statement").

      A rule that more carefully addresses the requirement to ensure top down initialization would allow arbitrary statements prior to superclass construction, as long as the instance under construction (this) is not accessed until superclass construction completes. This would allow constructors to do any desired "housekeeping" prior to superclass construction. Such a rule would closely follow the familiar existing rules for blank final fields, where access is disallowed prior to initialization, the initialization must happen exactly once, and full access is permitted afterward.

      The fact that the current enforcement mechanism is unnecessarily restrictive is, in itself, a reason for change. There are also practical reasons to relax this restriction. For one, the current rules cause certain idioms commonly used within normal methods to be either difficult or impossible to use within constructors. Below are a few examples.

      Implementing "Fail Fast"

      A subclass constructor sometimes wishes to enforce a requirement on a parameter that is also passed up to the superclass constructor. Today such requirements can only be applied "inline" (e.g., using static methods) or after the fact.

      For example:

      public class PositiveBigInteger extends BigInteger {
      
          public PositiveBigInteger(long value) {
              super(PositiveBigInteger.verifyPositive(value));
          }
      
          // This logic really belongs in the constructor
          private static long verifyPositive(long value) {
              if (value <= 0)
                  throw new IllegalArgumentException("non-positive value");
              return value;   // hand the validated value back for use as the super() argument
          }
      }

      or:

      public class PositiveBigInteger extends BigInteger {
      
          public PositiveBigInteger(long value) {
              super(value);   // potentially doing useless work here
              if (value <= 0)
                  throw new IllegalArgumentException("non-positive value");
          }
      }

      It would be more natural to validate parameters as the first order of business, just as in normal methods:

      public class PositiveBigInteger extends BigInteger {
      
          public PositiveBigInteger(long value) {
              if (value <= 0)
                  throw new IllegalArgumentException("non-positive value");
              super(value);
          }
      }

      Passing Superclass Constructor the Same Parameter Twice

      Sometimes you need to create a single object and pass it to the superclass constructor twice, as two different parameters.

      Today the only way to do that requires adding an extra intermediate constructor:

      public class MyExecutor extends ScheduledThreadPoolExecutor {
      
          public MyExecutor(int corePoolSize) {
              this(corePoolSize, new MyFactoryHandler());
          }
      
          // Extra intermediate constructor we must hop through
          private MyExecutor(int corePoolSize, MyFactoryHandler factory) {
              super(corePoolSize, factory, factory);
          }
      
          private static class MyFactoryHandler
            implements ThreadFactory, RejectedExecutionHandler {
              ...
          }
      }

      A more straightforward implementation might look like this:

      public class MyExecutor extends ScheduledThreadPoolExecutor {
      
          public MyExecutor(int corePoolSize) {
              MyFactoryHandler factory = new MyFactoryHandler();
              super(corePoolSize, factory, factory);
          }
      
          private static class MyFactoryHandler
            implements ThreadFactory, RejectedExecutionHandler {
              ...
          }
      }

      Complex Preparation of Superclass Constructor Parameters

      Sometimes, complex handling or preparation of superclass parameters is needed.

      For example:

      public class MyBigInteger extends BigInteger {
      
          /**
           * Use the public key integer extracted from the given certificate.
           *
           * @param certificate public key certificate
           * @throws IllegalArgumentException if certificate type is unsupported
           */
          public MyBigInteger(Certificate certificate) {
              final byte[] bigIntBytes;
              PublicKey pubkey = certificate.getPublicKey();
              if (pubkey instanceof RSAKey rsaKey)
                  bigIntBytes = rsaKey.getModulus().toByteArray();
              else if (pubkey instanceof DSAPublicKey dsaKey)
                  bigIntBytes = dsaKey.getY().toByteArray();
              else if (pubkey instanceof DHPublicKey dhKey)
                  bigIntBytes = dhKey.getY().toByteArray();
              else
                  throw new IllegalArgumentException("unsupported cert type");
              super(bigIntBytes);
          }
      }

      All of the above examples showing code before super() still adhere to the principle of "initialize the superclass before accessing the new instance" and therefore preserve top down initialization.

      What the JVMS Actually Allows

      Fortunately, the JVMS already grants suitable flexibility to constructors:

      • Multiple invocations of this() and/or super() may appear in a constructor, as long as on any code path there is exactly one invocation
      • Arbitrary code may appear before this()/super(), as long as that code doesn't reference the instance under construction, with an exception carved out for field assignment
      • However, invocations of this()/super() may not appear within a try { } block (i.e., within a bytecode exception range)

      As described above, these more permissive rules still ensure "top down" initialization:

      • Superclass initialization always happens exactly once, either directly via super() or indirectly via this(); and
      • Uninitialized instances are "off limits", except for field assignments (which do not affect outcomes), until superclass initialization is performed

      In fact, the current inconsistency between the JLS and the JVMS is something of a historical artifact. The original JVMS was more restrictive as well, but this led to issues with initialization of compiler-generated fields that supported new language features such as inner classes and captured free variables. As a result, the JVMS was relaxed to accommodate the compiler, but this new flexibility never made its way back up to the language level.
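      The effect of that relaxation can be observed from ordinary source code. The sketch below uses hypothetical classes (Hooked, Enclosing, Inner, Local) and relies on the long-standing javac behavior of storing the enclosing instance and captured variables into compiler-generated fields before invoking the superclass constructor; the names and layout of those fields are compiler implementation details, not something the JLS specifies.

```java
// Hypothetical classes for illustration only.
abstract class Hooked {
    Hooked() {
        init();   // calls into the subclass before the subclass constructor body runs
    }

    abstract void init();
}

class Enclosing {
    private final String name;

    Enclosing(String name) {
        this.name = name;
    }

    class Inner extends Hooked {
        String seen;                 // no initializer, so the value stored by init() survives

        @Override
        void init() {
            seen = name;             // reads Enclosing.this.name during Hooked's constructor
        }
    }

    static String captured(int x) {
        class Local extends Hooked {
            String seen;

            @Override
            void init() {
                seen = "x=" + x;     // reads the captured parameter during Hooked's constructor
            }
        }
        return new Local().seen;
    }
}

class EarlyFieldDemo {
    public static void main(String[] args) {
        System.out.println(new Enclosing("outer").new Inner().seen); // prints outer
        System.out.println(Enclosing.captured(42));                  // prints x=42
    }
}
```

      Both reads succeed only because the compiler-generated fields are assigned ahead of the superclass constructor call, which is the kind of pre-super() field assignment the relaxed JVMS rules permit.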

      Fixing a Specification Bug

      JLS §8.1.3 defines static context and notes that "The purpose of a static context is to demarcate code that must not refer explicitly or implicitly to the current instance of the class whose declaration lexically encloses the static context". Going by its stated purpose, a static context would seem to naturally apply to code inside a super() or this() invocation, and in fact the JLS does just that. Prior to the advent of generics, inner classes, and captured free variables, this yielded the correct semantics for superclass constructor invocation.

      However, as §8.1.3 notes, a static context prohibits:

      • this expressions (both unqualified and qualified)
      • Unqualified references to instance variables of any lexically enclosing class or interface declaration
      • References to type parameters, local variables, formal parameters, and exception parameters declared by methods or constructors of any lexically enclosing class or interface declaration that is outside the immediately enclosing class or interface

      Those rules make the following program illegal:

      import java.util.concurrent.atomic.AtomicReference;
      
      public class ClassA<T> extends AtomicReference<T> {
      
          private int intval;
      
          public ClassA(T obj) {
              super(obj);
          }
      
          public class ClassB extends ClassA<T> {
              public ClassB() {
                  super((T)null);             // illegal - 'T'
              }
          }
      
          public class ClassC extends ClassA<Object> {
              ClassC() {
                  super(ClassA.this);         // illegal - 'this'
              }
          }
      
          public class ClassD extends ClassA<Integer> {
              ClassD() {
                  super(intval);              // illegal - 'intval'
              }
          }
      
          public static Object method(int x) {
              class ClassE extends ClassA<Float> {
                  ClassE() {
                      super((float)x);        // illegal - 'x'
                  }
              }
              return new ClassE();
          }
      }

      But not only has this program compiled successfully since at least Java 8, the above idioms are in common use. The mental model that the compiler and developers seem to both be using is indeed that code "must not refer explicitly or implicitly to the current instance of the class whose declaration lexically encloses" the code in question. However, this is no longer what a "static context" prohibits; instead, it goes beyond that, for example, restricting even references to generic type parameters.

      The underlying issue is that "static context" is applied to two scenarios which are similar, but not equivalent:

      1. When there is no 'this' instance defined, e.g., within a static method
      2. When the 'this' instance is defined but must not be referenced, e.g., prior to superclass initialization

      The current definition of "static context" is appropriate for scenario #1, but after the addition of generics, inner classes, and captured free variables to the language, no longer for scenario #2. We need a new, distinct concept, which we'll call a "pre-initialization context".

      A "pre-initialization context" is a less restrictive version of "static context" that still disallows accessing the current instance in any way, but doesn't disallow, for example, use of the class's generic type parameters, or accessing the outer instance via the expression Outer.this. This will more accurately match not only the underlying requirement, but also developer expectations, common usage, and the compiler's behavior going back as far as Java 8 (see also JDK-8301649, which this change will effectively fix by codifying the current compiler behavior).

      Description

      Language Changes

      Summary of JLS modifications:

      • Update the grammar to allow statements (other than return) to appear prior to super() or this()
      • Redefine the part of a constructor up to and including the super() or this() call as a "pre-initialization context".
      • Update existing references to "static context" elsewhere in the specification as needed.

      List of specific JLS modifications:

      (1) §8.8.7 "Constructor Body"

      Modify the beginning of this section to read:

      A constructor body may contain an explicit invocation of another constructor of the same class or of the direct superclass (§8.8.7.1).

      ConstructorBody:
          { [BlockStatements] } ;
          { [BlockStatements] ExplicitConstructorInvocation [BlockStatements] } ;

      It is a compile-time error for a constructor to directly or indirectly invoke itself through a series of one or more explicit constructor invocations involving this.

      If a constructor body does not contain an explicit constructor invocation and the constructor being declared is not part of the primordial class Object, then the constructor body implicitly begins with a superclass constructor invocation "super();", an invocation of the constructor of its direct superclass that takes no arguments.

      Except for the possibility of explicit constructor invocations, and the prohibitions on return statements (§14.17), the body of a constructor is like the body of a method (§8.4.7).

      If a constructor body contains an explicit constructor invocation, the BlockStatements preceding the explicit constructor invocation are called the prologue of the constructor body. The BlockStatements in a constructor with no explicit constructor invocation and the BlockStatements following the explicit constructor invocation in a constructor with an explicit constructor invocation are called the main body of the constructor.

      A return statement (§14.17) may be used in the main body of a constructor if it does not include an expression. It is a compile-time error if a return statement appears in the prologue of a constructor body.

      (2) §8.8.7.1 "Explicit Constructor Invocations"

      Modify this sentence that follows the bullet points:

      An explicit constructor invocation statement introduces a pre-initialization context, which includes the prologue of the constructor and the explicit constructor invocation statement, and which prohibits the use of constructs that refer explicitly or implicitly to the current object. These include this or super referring to the current object, unqualified references to instance variables or instance methods of the current object, method references referring to instance methods of the current object, and instantiations of inner classes of the current object's class for which the current object is the enclosing instance (§8.1.3).

      (3) §12.5 "Creation of New Class Instances"

      Replace the numbered steps for constructor processing with the following:

      1. Assign the arguments for the constructor to newly created parameter variables for this constructor invocation.
      2. If this constructor contains an explicit constructor invocation (§8.8.7.1), then execute the BlockStatements of the prologue of the constructor body. If execution of any statement completes abruptly, then execution of the constructor completes abruptly for the same reason; otherwise, continue with step 3.
      3. If this constructor contains an explicit constructor invocation (§8.8.7.1) of another constructor in the same class (using this), then evaluate the arguments and process that constructor invocation recursively using these same six steps. If that constructor invocation completes abruptly, then this procedure completes abruptly for the same reason; otherwise, continue with step 6.
      4. This constructor does not contain an explicit constructor invocation of another constructor in the same class (using this). If this constructor is for a class other than Object, then this constructor contains an explicit or implicit invocation of a superclass constructor (using super). Evaluate the arguments and process that superclass constructor invocation recursively using these same six steps. If that constructor invocation completes abruptly, then this procedure completes abruptly for the same reason. Otherwise, continue with step 5.
      5. Execute the instance initializers and instance variable initializers for this class, assigning the values of instance variable initializers to the corresponding instance variables, in the left-to-right order in which they appear textually in the source code for the class. If execution of any of these initializers results in an exception, then no further initializers are processed and this procedure completes abruptly with that same exception. Otherwise, continue with step 6.
      6. Execute the main body of this constructor. If that execution completes abruptly, then this procedure completes abruptly for the same reason. Otherwise, this procedure completes normally.
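      The ordering of steps 3 through 6 can be traced with code that is already legal today. This is a sketch with hypothetical classes (Trace, Parent, Child); step 2 is not exercised because a non-empty prologue is not valid under the current rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes for illustration only.
class Trace {
    static final List<String> LOG = new ArrayList<>();

    // Records a message and passes the value through, so it can be used inside expressions.
    static int log(String message, int value) {
        LOG.add(message);
        return value;
    }
}

class Parent {
    Parent(int value) {
        Trace.LOG.add("Parent main body");
    }
}

class Child extends Parent {
    private final int field = Trace.log("Child field initializer", 1);   // step 5

    Child() {
        this(Trace.log("this() argument", 7));       // step 3
        Trace.LOG.add("Child() main body");          // step 6
    }

    Child(int value) {
        super(Trace.log("super() argument", value)); // step 4
        Trace.LOG.add("Child(int) main body");       // step 6
    }
}

class StepsDemo {
    public static void main(String[] args) {
        new Child();
        Trace.LOG.forEach(System.out::println);
    }
}
```

      Running StepsDemo prints the entries in this order: "this() argument", "super() argument", "Parent main body", "Child field initializer", "Child(int) main body", "Child() main body". The field initializer runs exactly once, in the constructor that invokes super(), because the this() path jumps from step 3 directly to step 6.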

      (4) §8.1.3 "Inner Classes and Enclosing Instances"

      • After "A construct (statement, local variable declaration statement, local class declaration, local interface declaration, or expression) occurs in a static context if the innermost:", remove the bullet point "explicit constructor invocation statement".
      • After "which encloses the construct is one of the following:", remove the bullet point "an explicit constructor invocation statement (§8.8.7.1)".
      • Rewrite the second following note as follows:
        • The purpose of a static context is to demarcate code for which there is no current instance defined of the class whose declaration lexically encloses the static context. Consequently, code that occurs in a static context is restricted in the following ways...

      (5) §6.5.6.1 "Simple Expression Names"

      Modify the first bullet point to read:

      • The expression name does not occur in a static context (§8.1.3) or in a pre-initialization context (§8.8.7.1).

      (6) §6.5.7.1 "Simple Method Names"

      • Modify the last sentence in the first paragraph as follows: "The rules also prohibit (§15.12.3) a reference to an instance method occurring in a static context (§8.1.3), a pre-initialization context (§8.8.7.1), or in a nested class or interface..."

      (7) §15.8.3 "this"

      • Change the second bullet point to read "in the main body of a constructor of a class (§8.8.7)".
      • Modify this sentence as follows: "It is a compile-time error if a this expression occurs in a static context (§8.1.3) or in a pre-initialization context (§8.8.7.1)."

      (8) §15.11.2 "Accessing Superclass Members using super"

      • Modify this sentence as follows: "It is a compile-time error if a field access expression using the keyword super appears in a static context (§8.1.3) or in a pre-initialization context (§8.8.7.1)."

      (9) §15.12.3 "Compile-Time Step 3: Is the Chosen Method Appropriate?"

      • Modify all three of these sentences as follows: "It is a compile-time error if the method invocation occurs in a static context (§8.1.3) or in a pre-initialization context (§8.8.7.1)."

      (10) §15.13 "Method Reference Expressions"

      • Modify this sentence as follows: "If a method reference expression has the form super :: [TypeArguments] Identifier or TypeName . super :: [TypeArguments] Identifier, it is a compile-time error if the expression occurs in a static context (§8.1.3) or in a pre-initialization context (§8.8.7.1)."

      (11) §15.13.1 "Compile-Time Declaration of a Method Reference"

      • Modify this sentence as follows: "It is a compile-time error if the method reference expression has the form super :: [TypeArguments] Identifier or TypeName . super :: [TypeArguments] Identifier, and the method reference expression occurs in a static context (§8.1.3) or in a pre-initialization context (§8.8.7.1)."

      Records

      Record constructors are subject to more restrictions than normal constructors. In particular:

      • Canonical record constructors may not contain any explicit super() or this() invocation
      • Non-canonical record constructors may invoke this(), but not super()

      These restrictions remain in place, but otherwise record constructors also benefit from these changes. The net result is that non-canonical record constructors may now contain prologue statements before this().
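      As a sketch of what this means in practice, the hypothetical record below (Range and its parse helper are illustrative names) compiles under the current rules, where a non-canonical constructor has to fold its preparation into the this() argument expressions. The trailing comment shows how the same constructor might be written with a prologue if this proposal is adopted; that form is an assumption about the eventual feature and is not compiled here.

```java
// Hypothetical record for illustration only.
record Range(int lo, int hi) {

    // Canonical (compact) constructor: no explicit this() or super() is allowed here.
    Range {
        if (lo > hi)
            throw new IllegalArgumentException("lo > hi");
    }

    // Non-canonical constructor: must delegate via this(). Under the current rules
    // the parsing is pushed into a static helper and repeated per argument.
    Range(String text) {
        this(parse(text, 0), parse(text, 1));
    }

    private static int parse(String text, int index) {
        String[] parts = text.split("\\.\\.");
        if (parts.length != 2)
            throw new IllegalArgumentException("expected lo..hi");
        return Integer.parseInt(parts[index].trim());
    }

    // With prologue statements permitted before this(), the constructor could become:
    //
    //     Range(String text) {
    //         String[] parts = text.split("\\.\\.");
    //         if (parts.length != 2)
    //             throw new IllegalArgumentException("expected lo..hi");
    //         this(Integer.parseInt(parts[0].trim()), Integer.parseInt(parts[1].trim()));
    //     }
}

class RangeDemo {
    public static void main(String[] args) {
        System.out.println(new Range("1..5")); // prints Range[lo=1, hi=5]
    }
}
```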

      Testing

      Testing of compiler changes will be done using the existing unit tests, which are unchanged except for those tests that verify changed compiler behavior, plus new positive and negative test cases related to this new feature.

      All existing JDK classes will be compiled using the previous and new versions of the compiler, and the bytecode compared, to verify there is no change to existing bytecode.

      No platform-specific testing should be required.

      Risks and Assumptions

      An explicit goal of this work is to not change the behavior of existing programs. Therefore, other than any newly created bugs, the risk to existing software should be low.

      It's possible that compiling and/or executing newly valid code could trigger bugs in existing code that were not previously accessible.

            People

              acobbs Archie Cobbs
              Brian Goetz, Vicente Arturo Romero Zaldivar