JDK-8303099

Null-Restricted and Nullable Types (Preview)


    • Type: JEP
    • Resolution: Unresolved
    • Priority: P3
    • Fix Version/s: None
    • Component/s: tools
    • Labels: None
    • JEP Type: Feature
    • Exposure: Open
    • Scope: SE
    • Discussion: valhalla dash dev at openjdk dot org
    • Effort: L
    • Duration: M

      Summary

      Support nullness markers on Java types to indicate that a type rejects or deliberately allows nulls. This is a preview language feature.

      Goals

      • Enhance Java's reference types to let programmers express whether null references are expected as values of the type

      • Support conversions between types with different nullness properties, accompanied by warnings about possibly-mishandled null values

      • Compatibly interoperate with traditional Java code that makes no assertions about the null compatibility of its types, and support gradual adoption of these new features without introducing source or binary incompatibilities

      • Ensure that variables with types that reject null are initialized before they are first read

      • Enforce types that reject null at run time, even when classes are compiled separately

      • Provide the metadata and integrity guarantees necessary for run-time optimizations (such as the flattening of value objects) to rely on types that claim to exclude null

      Non-Goals

      • It is not a goal to automatically re-interpret existing code—use of these features should be optional and explicitly opted in to (future work will explore mechanisms to request a bulk opt-in without needing to change individual types)

      • It is not a goal to require programs to explicitly account for all null values that might occur; unaccounted-for null values may cause compile-time warnings, but not compile-time errors

      • It is not a goal to make any changes to the primitive types, such as adding support for a nullable int type

      • It is not a goal (at this time) to apply the language enhancements to the standard libraries

      Motivation

      In a Java program, a variable of type String may hold either a reference to a String object or the special value null. In some cases, the author intends that the variable will always hold a reference to a String object; in other cases, the author expects null as a meaningful value. Unfortunately, there is no way to formally express in the language which of these alternatives is intended, leading to confusion and bugs.

      For example, programs often make a blanket assumption that no null values will be present. But it takes extra care to state this expectation in Javadoc specifications and to reliably enforce it in implementation code. If extra care isn't taken, and someone fails to follow the assumed protocol, then null values may end up flowing freely through implementation code, eventually triggering an exception at some point far removed from the bug.

      This situation can be greatly improved by giving developers tools to assert, as part of a type, that either (1) null values are not supported and will be rejected, or (2) null values are expected and should be properly accounted for.

      By default, the language can't reasonably assume either of these interpretations. It needs developers to explicitly indicate their intent.

      Given a clear expression of intent, the language could then introduce both compile-time feedback and run-time checks to help developers detect unexpected nulls earlier.

      In the Valhalla project, a variable of a value class type can be optimized with a flattened representation of its values. But this flattened representation may need extra bits to encode null, negatively impacting memory footprint and sometimes making it impossible to optimize the storage at all. If the developer could exclude null from the domain of the variable, a better encoding could be achieved.

      In the Amber project, the nullness of a pattern match candidate may influence whether a switch should be considered exhaustive, and the nullness of a type pattern may influence whether the pattern matches null. It would be useful for developers to be able to control these behaviors.

      Description

      The features described below are preview features, enabled with the --enable-preview compile-time and runtime flags.

      Nullness properties and markers

      A reference type may optionally express nullness—whether null is intended to be included in the value set of the type.

      In Java syntax, a nullness marker is used to indicate this property.

      The type Foo! is null-restricted: the value set excludes null.

      The type Foo? is nullable: the value set deliberately includes null.

      By default, the nullness of the type Foo is unspecified: a null may occur, but we don't know whether its presence is deliberate.

      The nullness of a type is an intrinsic part of the type—in other words, Foo? and Foo are different types, because they have different nullness. However, as outlined later, most language rules are defined in a way that either ignores nullness or helpfully adapts (perhaps with a warning) between types with different nullness.

      Array types and their array component types may both have nullness markers. Foo?[]! is a null-restricted array type whose components are of the nullable type Foo?. Nullness markers for multi-dimensional arrays may occur after each bracket pair, and by convention are interpreted outermost to innermost, from left to right.

      Parameterized types and their type arguments may similarly both have nullness markers. Predicate!<Foo?> is a null-restricted Predicate with the nullable type Foo? as its type argument. The interpretation of type arguments is described in more detail later.

      In this JEP, nullness markers are explicit: to express a null-restricted or nullable type, the ! or ? symbol must appear in source. In the future, it may be useful, for example, for a programmer to somehow express that every type in a class or compilation unit is to be interpreted as null-restricted, unless a ? symbol is present. The details of any such feature will be explored separately.

      Field and array initialization

      Most variables in Java must be initialized with a value of the variable's type before they can be used. Local variables are checked for definite assignment before they can be referenced; method parameters get initial values from a method invocation expression; pattern variables are bound in the process of pattern matching; etc.

      Traditionally, fields and array components get special treatment: because they can be accessed by multiple program components immediately upon creation of a class or object, they are automatically initialized "at birth" with a default value, which programmers will typically overwrite in the course of program execution.

      The default value of a reference type is null. But this is an unsuitable initial value for a null-restricted field or array component—if someone reads the variable before it has been written, they will observe a value that is not of the variable's type.
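
      The hazard can be reproduced in today's Java, without any nullness markers. The sketch below is illustrative only: the class names (Greeter, NamedGreeter, DefaultValueDemo) are hypothetical and do not come from this JEP. It shows a field being read during the superclass constructor, and a freshly allocated array component, both of which still hold the default value null.

```java
// Today's Java: reference-typed fields and array components start out as null.
class Greeter {
    Greeter() {
        greet(); // runs before any subclass field has been assigned
    }

    void greet() {}
}

class NamedGreeter extends Greeter {
    static String seenDuringSuper = "unset";

    // Under this JEP the field might be declared String!, which would
    // require it to be assigned before the super(...) call.
    private final String name;

    NamedGreeter(String name) {
        super();          // Greeter() calls greet() while name is still null
        this.name = name;
    }

    @Override
    void greet() {
        seenDuringSuper = String.valueOf(name); // records what was observed
    }
}

class DefaultValueDemo {
    static String earlyRead() {
        new NamedGreeter("Ada");
        return NamedGreeter.seenDuringSuper;
    }

    static String freshArrayComponent() {
        String[] labels = new String[3];
        return labels[0]; // never written, so still the default value
    }

    public static void main(String[] args) {
        System.out.println(earlyRead());           // prints "null"
        System.out.println(freshArrayComponent()); // prints "null"
    }
}
```

      In both cases the reader observes null even though the author never stored it, which is exactly the situation a null-restricted type cannot tolerate.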

      Thus, fields and arrays with null-restricted types behave differently than other fields and arrays: they must always be initialized by the program before they can be read. This is enforced as follows:

      • A null-restricted instance field without an initializer must be definitely assigned before the (explicit or implicit) super(...) call in each of the class's constructors. The Flexible Constructor Bodies JEP allows the necessary initialization code to be written at the start of a constructor. In this early construction context, the initialization logic is not allowed to refer to this or risk any attempts to read the uninitialized field.

        class Person {
            private String! name;

            public Person(String name) {
                this.name = name;
                super();
            }
        }
      • If a null-restricted instance field has an initializer, the initializer is executed at the start of each constructor, before the super(...) call. (Constructors that call this(...) are a special case and, as usual, do not execute initializers at all.) Again, this means that the initialization logic of the field occurs in an early construction context and may not refer to this or risk any reads of the uninitialized field.

      • A null-restricted static field must be definitely assigned by the end of all static initializers and initializer blocks of the class. Note, however, that this rule does not prevent some other class from trying to read the field during the class initialization process; in that case, a run-time check detects the early read attempt and throws an exception.

        In the following example, the initialization code of field Foo.s has a circular dependency. Traditionally, the circular reference to Foo.s would produce the default value of s, null; with a null-restricted field type, that is impossible and so an exception is thrown alerting the developer to the bug.

        class Foo {
            public static String! s = Bar.getString();
        }

        class Bar {
            static String! getString() {
                return Foo.s; // may throw an exception
            }
        }
      • An array with a null-restricted component type must provide an initializer for each component in the array creation expression. This can be achieved by explicitly listing each initial value in an array initializer, or by using a new shorthand form (syntax TBD).

        String![] labels;
        labels = new String![]{ "x", "y", "z" };
        labels = new String![100]{ "" }; // strawman syntax
        labels = new String![100]{ i -> "x"+i }; // strawman syntax
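
      Since the shorthand syntax is still undecided, the closest approximation in today's Java is to allocate the array and then fill it with the existing java.util.Arrays helpers. The sketch below assumes that reading of the strawman forms; the class and method names (LabelArrays, filled, indexed) are hypothetical.

```java
import java.util.Arrays;

class LabelArrays {
    // Approximates the strawman `new String![n]{ "" }`:
    // every component receives the same initial value.
    static String[] filled(int n, String value) {
        String[] a = new String[n]; // components are null at this point
        Arrays.fill(a, value);
        return a;
    }

    // Approximates the strawman `new String![n]{ i -> "x"+i }`:
    // each component is computed from its index.
    static String[] indexed(int n) {
        String[] a = new String[n]; // components are null at this point
        Arrays.setAll(a, i -> "x" + i);
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(indexed(3)));     // prints "[x0, x1, x2]"
        System.out.println(Arrays.toString(filled(2, "-"))); // prints "[-, -]"
    }
}
```

      The difference is that these helpers run after allocation, so the array briefly holds nulls. An initializer in the array creation expression would leave no such window.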

      Expression nullness and conversions

      As part of type checking, the Java compiler is responsible for determining the nullness of every expression.

      • The nullness of a variable reference is given by the referenced variable's declaration (but TBD whether the Java compiler will further observe that, due to previous uses of the variable, a null is known not to be present).

      • The nullness of a method invocation is given by the referenced method's return type.

      • The nullness of a cast expression is given by the type named in the cast (but, again, TBD whether the Java compiler takes other information into account).

      • A null literal is nullable (of course).

      • Most other reference-typed expressions are null-restricted. These include literals, string concatenations, this, class instance and array creations, method references, and lambda expressions.

      A nullness conversion allows an expression with one kind of nullness to be treated as having a different nullness. Nullness conversions are permitted in all assignment, invocation, and casting contexts.

      The following are widening nullness conversions:

      • Foo! to Foo?
      • Foo! to unspecified Foo
      • Foo? to unspecified Foo
      • unspecified Foo to Foo?

      And these are narrowing nullness conversions:

      • Foo? to Foo!
      • unspecified Foo to Foo!

      Narrowing nullness conversions are analogous to unboxing conversions: the compiler performs them automatically, while at run time they impose a dynamic check, possibly causing a NullPointerException.

      In fact, the boxing and unboxing conversions can now be simplified: boxing converts int to Integer!, potentially followed by a widening nullness conversion to Integer; unboxing converts Integer! to int, possibly preceded by a narrowing nullness conversion from Integer to Integer!.
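
      The analogy with unboxing can be observed in today's Java. In the sketch below, the unboxing conversion is real current behavior, while narrow is a hypothetical hand-written stand-in for a narrowing nullness conversion, built on the existing Objects.requireNonNull. It is not how the JEP would compile such a conversion.

```java
import java.util.Objects;

class NarrowingAnalogy {
    // Unboxing is inserted by the compiler and throws NullPointerException
    // at run time when the Integer reference is null.
    static boolean unboxingThrows(Integer boxed) {
        try {
            int unboxed = boxed;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Hypothetical stand-in for narrowing String (or String?) to String!.
    static String narrow(String maybeNull) {
        return Objects.requireNonNull(maybeNull, "null cannot be narrowed");
    }

    static boolean narrowingThrows(String value) {
        try {
            narrow(value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(unboxingThrows(null));  // prints "true"
        System.out.println(unboxingThrows(7));     // prints "false"
        System.out.println(narrowingThrows(null)); // prints "true"
        System.out.println(narrowingThrows("ok")); // prints "false"
    }
}
```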

      It is a compile-time error to attempt to directly convert a null literal to a null-restricted type.

      Run-time null checking

      At run time, if a null value undergoes a narrowing nullness conversion to a null-restricted type, a NullPointerException is thrown.

      String? id(String! arg) { return arg; }
      
      String s = null;
      Object! o1 = s; // NPE
      Object o2 = id(s); // NPE
      Object o3 = (String!) s; // NPE

      Some narrowing nullness conversions are not apparent in the source code, but occur implicitly as part of run-time execution. These include:

      • An array that was allocated with a null-restricted component type may be given a less specific type in the source code, but will still reject null values during the usual array store check. The failed conversion will prompt an ArrayStoreException.

      • Similarly, a field that was not null-restricted at compile time but was later separately compiled to be null-restricted will reject null values during a new field store check. The failed conversion will prompt a FieldStoreException.

      • When one method overrides another, the argument to an invocation of the superclass method undergoes conversion to the parameter's invocation type, followed by conversion to the type of the overriding method's parameter. As discussed below, the nullness of these two parameter types may not be the same.

      • Similarly, the return value of a method undergoes conversion to the declared method return type, followed by conversion to the invocation's expected return type.
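
      The "usual array store check" mentioned in the first item already exists in today's Java, where it guards against storing an object of the wrong class. The sketch below shows that existing behavior only; the class name ArrayStoreDemo is hypothetical, and the comment about String![] restates this JEP's proposal, not something that can be run today.

```java
class ArrayStoreDemo {
    // Attempts the store and reports whether the JVM's array store check rejected it.
    static boolean storeRejected(Object[] target, Object value) {
        try {
            target[0] = value;
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The static type is Object[], but the array was allocated as String[].
        Object[] objs = new String[1];

        System.out.println(storeRejected(objs, 42));   // prints "true"
        System.out.println(storeRejected(objs, "ok")); // prints "false"

        // Today a String[] accepts null. Under this JEP, an array allocated
        // as String![] would reject this store during the same check.
        System.out.println(storeRejected(objs, null)); // prints "false"
    }
}
```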

      Nullness of type variables

      Like other types, a type-variable type (that is, a use of a type variable) may express nullness. T! is a null-restricted type, and T? is a nullable type.

      Null-restricted and nullable type variable types (T! and T?) assert a specific nullness within the generic code. This may be appropriate if the generic code directly interacts with null.

      class Box<T> {
          boolean set;
          T? val; // nullable field
      
          public Box() { set = false; }
          public void set(T val) { this.val = val; set = true; }
      
          public T? getOrNull() { // nullable result
              return set ? val : null;
          }
      
          public T! getNonNull(T! alt) { // null-restricted result
              return (set && val != null) ? (T!) val : alt;
          }
      }

      Types used as type arguments may express nullness; null markers on type-variable types override whatever nullness was asserted by the type argument.

      Box<String!> b1 = new Box<String!>();
      b1.getOrNull(); // nullable result
      
      Box<String?> b2 = new Box<String?>();
      b2.set(null);
      b2.getNonNull(""); // null-restricted result

      Of course, null restrictions cannot be enforced within the erased implementation of a generic API. But the usual implicit casts that occur at the boundaries of generic APIs will enforce null-restricted type arguments at run time.
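
      The implicit casts at generic boundaries are observable today, where they enforce the class of a type argument but not its nullness. The sketch below uses only existing behavior and hypothetical names (BoundaryCastDemo, polluted): the erased list accepts a value of the wrong type, and the failure surfaces only when the caller's compiler-inserted cast runs. This JEP suggests null-restricted type arguments would be enforced at the same boundary.

```java
import java.util.ArrayList;
import java.util.List;

class BoundaryCastDemo {
    // The erased implementation cannot check its element type,
    // so an Integer can be stored in what callers see as a List<String>.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<String> polluted() {
        List raw = new ArrayList();
        raw.add(42);
        return raw; // unchecked conversion
    }

    static boolean failsAtBoundary() {
        List<String> strings = polluted();
        try {
            String s = strings.get(0); // the compiler inserts a cast to String here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAtBoundary()); // prints "true"
    }
}
```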

      Type arguments and bounds

      As illustrated above, type arguments may express nullness, which influences the substituted nullness of the API wherever parametric type variables occur.

      For interoperability, nullness in type arguments is not strongly enforced, and unchecked nullness conversions allow modifications to the nullness of type arguments. For example, Predicate<String!> can be converted to Predicate<String> or Predicate<String?>. These conversions may cause warnings (see "Compiler warnings", below).

      Similarly, unchecked nullness conversions allow modifications to the nullness of array component types. TBD under what conditions these conversions are checked at run time.

      A type variable declaration or wildcard may have nullness markers on its bounds. A type may satisfy the bounds via nullness conversion, though, so again these nullness markers are not strongly enforced, but may cause warnings.

      Method overriding and type argument inference

      Nullness is ignored when determining whether two methods have the same signature. One method may override another even if the nullness of their parameters and returns does not match.

      class A {
          String? lookup(String! arg) { ... }
      }
      
      class B extends A {
          String lookup(String arg) { ... }
      }

      Such mismatches will be common as different APIs adopt nullness markers independently.

      Formally, two methods are considered to have the same signature if each parameter type and type parameter bound can be converted to the other via nullness and unchecked conversions.

      Similarly, the return type of an overriding method must be convertible to the overridden return type via a widening reference conversion, possibly followed by nullness and unchecked conversions.

      If a method is generic, any parametric uses of its type parameters in the method signature will influence the inferred nullness of type arguments. Nullness does not influence method applicability and cannot cause type argument inference to fail, but it can influence the nullness inferred for the method's return type. (Details of the inference algorithm TBD.)

      Compiler warnings

      As described above, making a type null-restricted may cause new compile-time errors if a field or array of the type is left uninitialized, or if an attempt is made to convert a null literal to the type. It may also be a compile-time error to compare a null literal to an expression with a null-restricted type.

      In other situations, nullness analysis is supplementary and does not cause compile-time errors. However, javac will provide warnings to help programmers avoid runtime errors. IDEs and other analysis tools are encouraged to do the same. Possible sources of warnings include:

      • Narrowing nullness conversions, especially from ? types

      • ?-typed expressions used in member accesses or other null-hostile operations

      • Type arguments whose nullness is inconsistent with their bounds

      • Method parameters or returns with nullness that doesn't match an overridden method

      • Unchecked conversions that change the nullness of a type

      Compilation & class file representation

      Most uses of null markers are erased in class files, with the accompanying run-time conversions being expressed directly in bytecode.

      Signature attributes have an updated grammar to allow ! and ? in types, as appropriate. Nullness is not encoded in method and field descriptors.

      However, to prevent pollution of fields, a new NullRestricted attribute allows a field to indicate that it does not allow null values. This has the following effects:

      • The field must also be marked ACC_STRICT, which indicates that it must be "strictly-initialized". The verifier ensures that all strictly-initialized instance fields have been assigned to at the point that a constructor makes a super(...) call.

      • All attempts to write to the field check for a null value and, if found, throw a FieldStoreException.

      Null-restricted array creation is not supported by the anewarray instruction, and must be accomplished with a call to the reflection API (see below). All attempts to write to a null-restricted array component reject null values during the usual array store check.

      Core reflection

      There are no Foo!.class or Foo?.class literals, and no associated instances of java.lang.Class. These types are derived from a class declaration, but do not represent distinct classes. (Compare List<String> and List<Integer>.)

      However, a new RuntimeType API describes the set of types that are enforced by array and field store checks at run time, including a null-restricted variant of every class and interface type. (This is a superset of the linkage types, which can appear in descriptors and are represented by the Class API.)

      The Field API supports querying a field's RuntimeType, which may not be the same as its getType result.

      The Array API supports variations of newInstance that allow a component type to be expressed with a RuntimeType. These variations also allow the caller to provide initial values for the array components, and will reject attempts to create null-restricted arrays without initial values. Another new method reflects the RuntimeType of an array's components.
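
      For orientation, the sketch below shows the existing reflection calls that this section builds on. It uses only current java.lang.reflect methods. The proposed RuntimeType API is not available in any released JDK, so it appears only in comments, and the class names (ReflectionToday, Holder) are hypothetical.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.util.ArrayList;

class ReflectionToday {
    static class Holder {
        String name;
    }

    // Field.getType() reports the linkage type from the descriptor. Under this
    // JEP, a separate query would report the field's RuntimeType.
    static Class<?> declaredFieldType() {
        try {
            Field f = Holder.class.getDeclaredField("name");
            return f.getType();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Array.newInstance takes a Class and leaves every component null. The JEP
    // describes variations that take a RuntimeType plus initial values.
    static Object newStringArray(int length) {
        return Array.newInstance(String.class, length);
    }

    // Types derived from one class declaration share a single Class object.
    static boolean sameClassForDifferentArguments() {
        return new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass();
    }

    public static void main(String[] args) {
        System.out.println(declaredFieldType() == String.class);          // prints "true"
        System.out.println(Array.get(newStringArray(2), 0));              // prints "null"
        System.out.println(newStringArray(2).getClass().getComponentType()
                == String.class);                                         // prints "true"
        System.out.println(sameClassForDifferentArguments());             // prints "true"
    }
}
```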

      Supplementary changes

      Traditional deserialization is not compatible with null-restricted fields and arrays. A separate JEP will provide a new mechanism to support serialization without exposing uninitialized null-restricted fields and arrays.

      Documentation generated by javadoc will include nullness markers.

      The java.lang.reflect.Type and javax.lang.model APIs will encode nullness in their representation of types.

      Alternatives

      A variety of development tools in the Java ecosystem have implemented their own compile-time tracking of nulls. These tools don't change the Java language and so naturally have some limitations, particularly in the syntax they can use (annotations) and the behavior they can affect (compile-time checks).

      Other programming languages track nullness in their type systems. Many are null-restricted by default. Many also consider it an error to assign to a null-restricted type without an explicit null check. In the case of Java, it's important that the feature be optional, and be something that programmers can incrementally make use of without a monolithic migration effort.

      Runtime enforcement of nullness can be implemented with explicit checks or calls to the Objects.requireNonNull standard API. But consistently applying these checks is tedious, necessitates additional documentation work, and makes programs harder to read. There's no way to apply these checks directly to variable storage, particularly fields and arrays.
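
      A small sketch of this alternative, using the existing Objects.requireNonNull API and a hypothetical CheckedAccount class, shows both the repetition and the gap: each entry point needs its own check, and nothing guards the array storage itself.

```java
import java.util.Objects;

class CheckedAccount {
    private String owner;
    private final String[] tags = new String[2]; // components start as null

    CheckedAccount(String owner) {
        this.owner = Objects.requireNonNull(owner, "owner"); // check #1
    }

    void setOwner(String owner) {
        this.owner = Objects.requireNonNull(owner, "owner"); // check #2, repeated by hand
    }

    String owner() {
        return owner;
    }

    String[] tags() {
        return tags;
    }

    static boolean constructorRejectsNull() {
        try {
            new CheckedAccount(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(constructorRejectsNull()); // prints "true"

        CheckedAccount account = new CheckedAccount("ada");
        account.tags()[0] = null;                      // no check can intercept this store
        System.out.println(account.tags()[0] == null); // prints "true"
    }
}
```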

      Dependencies

      Prerequisites:

      • Flexible Constructor Bodies (Second Preview) allows constructors to execute statements before a super(...) call and allows assignments to instance fields in this context. These changes facilitate the initialization requirements of null-restricted fields.

      Future work:

      • Null-Restricted Value Class Types (Preview) will optimize the encodings of null-restricted, value-class-typed fields and arrays, and may allow some value classes to declare their own default values.

      • JEP 402: Enhanced Primitive Boxing (Preview) will track nullness as it makes wider use of implicit boxing conversions in the language.

      • JVM class and method specialization (JEP 218, with revisions) will allow generic classes and methods to reify and enforce the nullness of (at least some of) their type arguments.

      Other possible future enhancements building on this JEP may include:

      • Applying nullness markers to certain parts of the standard APIs.

      • Enhancing the JVM to provide a concise, minimal-footprint way to express a null check in bytecode.

      • Enhancing the JVM to provide stronger low-level enforcement of null-restricted method parameters.

      • Providing a mechanism in the language to assert that all types in a certain context are implicitly null-restricted, without requiring the programmer to use explicit ! symbols.

            dlsmith Dan Smith