JDK-8321133

Derived Record Creation (Preview)

Details

    • JEP
    • Resolution: Unresolved
    • P4
    • None
    • specification
    • None
    • Gavin Bierman & Brian Goetz
    • Feature
    • Open
    • SE
    • amber dash dev at openjdk dot org

    Description

      Summary

      Enhance the Java language with derived creation for records. Since records are immutable objects, developers frequently create new records to reflect new data; derived creation streamlines code by deriving a new record from an existing record, specifying only the components that are different. This is a preview language feature.

      Goals

      • Provide a concise means to create new record instances derived from existing record values using a block of transformation code.
      • Streamline the declaration of record classes by eliminating the need to provide explicit "wither" methods, which are the immutable analogue of "setter" methods.

      Non-Goals

      • It is not a goal to provide a Pascal-style with construct that simplifies access to arbitrary complex expressions.
      • It is not a goal to provide support for a distinguished class of "wither" methods.
      • It is not a goal to provide derived instance creation expressions for ordinary, non-record class values; this may be the subject of a future JEP.

      Motivation

      Immutability is a powerful technique for creating safe, reliable code that is easy to reason about. Writing immutable classes in Java was traditionally a tedious exercise involving considerable boilerplate, but since JDK 16, record classes have made it easy to declare (shallowly) immutable data-centric classes.

      The immutability of record classes gives both safety and predictability, and enables a number of features that make them easy to use, including canonical constructors, accessor methods, and well-defined Object methods. But the systems that we need to model still have state, and we have to model the natural evolution of this state. Unfortunately, it can be quite cumbersome to evolve state modeled by record classes. For example, consider the following record class modeling state which is a location in 3D space:

      record Point(int x, int y, int z) { }

      Suppose we want to evolve the state by doubling the x coordinate of a Point oldLoc, resulting in Point newLoc:

      Point newLoc = new Point(oldLoc.x()*2, oldLoc.y(), oldLoc.z());

      This code, while straightforward, is laborious. Deriving newLoc from oldLoc means extracting every component of oldLoc, whether it changes or not, and providing a value for every component of newLoc, even if unchanged from oldLoc. It would be a constant tax on productivity if developers had to repeatedly deconstruct one record value (extract all its components) in order to instantiate a new record value with mostly the same components. Accordingly, the authors of record classes often hide the deconstruction and instantiation inside the record class by declaring so-called "wither" methods:

      record Point(int x, int y, int z) {
          Point withX(int newX) { 
              return new Point(newX, y, z); 
          }
          Point withY(int newY) { 
              return new Point(x, newY, z); 
          }
          Point withZ(int newZ) { 
              return new Point(x, y, newZ); 
          }
      }

      It is now possible to derive Point newLoc from oldLoc concisely, and to chain method calls in order to change more than one component (ignoring the intermediate values):

      Point newLoc = oldLoc.withX(oldLoc.x()*2);
      
      // Double newLoc's y and z components
      Point nextLoc = newLoc.withY(newLoc.y()*2)
                            .withZ(newLoc.z()*2);

      However, wither methods have two problems. First, they add boilerplate to the record class, which is unfortunate given that record classes aim to eliminate boilerplate such as JavaBean "getters" and "setters". Second, some record classes have semantic constraints that involve more than one component, enforced in the canonical constructor. This means that special care must be taken when writing wither methods. For example, if a record class has two List components that must have the same length, there must not be any wither methods that update only one List component at a time; we must carefully declare a wither method that calls the canonical constructor in such a way that both components are updated in one go. Put another way, wither methods are not always simple boilerplate -- we need to take semantic constraints into account to determine exactly what kind of boilerplate is required.
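
      As an illustration of the second problem, here is a minimal sketch in today's Java of a record class with two List components that must have the same length. The names Series and withAppended are hypothetical and chosen only for this example. The only safe wither is one that replaces both components in a single call to the canonical constructor:

```java
// Hypothetical record: labels and values must always have the same length.
record Series(java.util.List<String> labels, java.util.List<Integer> values) {
    Series {
        if (labels.size() != values.size())
            throw new IllegalArgumentException("labels and values must have the same length");
        // Defensive copies keep the stored lists unmodifiable.
        labels = java.util.List.copyOf(labels);
        values = java.util.List.copyOf(values);
    }

    // A wither that appended to only one list would always be rejected by
    // the canonical constructor, so both components are updated in one go.
    Series withAppended(String label, int value) {
        var newLabels = new java.util.ArrayList<>(labels);
        var newValues = new java.util.ArrayList<>(values);
        newLabels.add(label);
        newValues.add(value);
        return new Series(newLabels, newValues);
    }
}
```

      Every further kind of update to Series would need another hand-written method of this shape, each one written with the length constraint in mind.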

      A better way to derive new record values from old record values would be to let developers focus on transforming the components, and have the Java compiler handle the deconstruction and instantiation of record values automatically. To achieve this, we extend the Java language with derived instance creation expressions, written e with { ... }. For example:

      Point nextLoc = oldLoc with { 
          x *= 2; 
          y *= 2; 
          z *= 2; 
      };

      The expression to the left of with gives the initial record value (oldLoc). The block to the right of with gives a transformation on the state of the initial record value. The state consists of three local variables (x, y, z) which correspond to the components of the initial record value and are automatically initialized from them, e.g., x would have the value of oldLoc.x(). Evaluating the derived instance creation expression runs the code in the block to transform some or all of x, y, and z, then creates a new record value by passing them to the canonical constructor (new Point(x,y,z)).

      The state of the initial record value is represented by mutable local variables so that the state can be both queried and updated within the block. The ending state is validated by the canonical constructor invoked automatically at the end of the block. This ensures that any cross-component semantic constraints are respected by code in the block. For example, to revisit the record class with two List components of equal length, the canonical constructor can detect if the block appended to one local List variable but not the other.

      The block to the right of with need only mention the components that are transformed. For example:

      Point finalLoc = nextLoc with { x = 0; };

      The point finalLoc has its x component set to zero, but the same y and z components as the point nextLoc. Thus, the common situation of deriving a new record value by making only small changes to an existing record value is expressed succinctly.

      Derived instance creation expressions can be chained. This lets a large transformation be split into a series of smaller steps, aiding readability. For example:

      Point nextLoc = oldLoc 
                  with { x *= 2; } 
                  with { y *= 2; }
                  with { z *= 2; };

      Record values can be nested, where components are themselves record values. Derived instance creation expressions can be nested in order to transform nested record values. For example:

      record Marker(Point loc, String label, Icon icon) { }
      
      Marker m = new Marker(new Point(...), ..., ...);
      Marker scaled = m with { loc = loc with { x *= 2; y *= 2; z *= 2; }};
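
      For comparison, the following sketch shows roughly what the nested expression above saves us from writing by hand in today's Java. It assumes a String stands in for the Icon type, which is not defined here, and the class name NestedDerivation is hypothetical:

```java
record Point(int x, int y, int z) { }

// Assumption: icon is simplified to a String for the sake of a runnable sketch.
record Marker(Point loc, String label, String icon) { }

final class NestedDerivation {
    // Hand-written counterpart of:
    //   m with { loc = loc with { x *= 2; y *= 2; z *= 2; }}
    static Marker scaled(Marker m) {
        Point loc = m.loc();
        // Inner derivation: every component of Point is extracted and passed on.
        Point scaledLoc = new Point(loc.x() * 2, loc.y() * 2, loc.z() * 2);
        // Outer derivation: label and icon are copied even though they do not change.
        return new Marker(scaledLoc, m.label(), m.icon());
    }
}
```

      Each level of nesting repeats the extract-and-reconstruct pattern, including for the components that are unchanged.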

      Derived instance creation expressions can be used in record classes to simplify the implementation of basic operations. For example:

      record Complex(double re, double im) {
          Complex conjugate() { return this with { im = -im; }; }
          Complex realOnly()  { return this with { im = 0; }; }
          Complex imOnly()    { return this with { re = 0; }; }
      }

      In summary, derived instance creation expressions are an important part of the programming model for record classes. Since record classes are immutable by design, we need language-level support to update their components when the underlying state changes. Using with to derive new record values is good for readability, since it focuses only on the changed state, and good for correctness, since it engages the canonical constructor automatically.

      Description

      A derived instance creation expression provides a succinct way to create a new record value that is derived from an existing record value.

      The grammar for derived instance creation expressions is as follows:

      DerivedInstanceCreationExpression:
          Expression with Block

      The expression on the left-hand side is known as the origin expression. The type of the origin expression must be a record class type which is then taken to be the overall type of the derived instance creation expression. The block on the right-hand side is known as the transformation block. It is a normal Java block containing arbitrary block statements, but it has a couple of control flow restrictions:

      1. It may not contain a return statement.
      2. It may not contain a yield, break, or continue statement whose target contains the derived instance creation expression.

      (In other words, it is not possible to transfer control out of a transformation block other than by completing normally or completing abruptly because an exception has been thrown.)

      A number of local variables are declared implicitly immediately before the contents of any transformation block. For each record component, if any, in the header of the record class named by the type of the origin expression, a local variable with the same name and type is declared. These variables, known collectively as the local component variables, are initialized with the corresponding component values of the origin expression. Local component variables can shadow any local variables whose declarations are in scope.

      Any assignment statements that occur within the transformation block have the following constraint: If the left-hand side of the assignment is an unqualified name, that name must be either (i) the name of a local component variable, or (ii) the name of a local variable that is declared explicitly in the transformation block.

      The transformation block need only express the parts of the state being modified. If the transformation block is empty then the result of the derived instance creation expression is a copy of the value of the origin expression (the expression on the left-hand side).

      A derived instance creation expression is not a statement expression.

      A derived instance creation expression is evaluated as follows:

      1. Evaluate the origin expression (whose compile-time type names a record class R) to yield a value known as the origin value. If evaluation completes abruptly, evaluation of the derived instance creation expression completes abruptly for the same reason.

      2. If the origin value is null then evaluation of the derived instance creation expression completes abruptly with a NullPointerException.

      3. Before executing the contents of the transformation block, a number of implicit local variable declaration statements are executed. These local variable declaration statements are derived from each record component in the header of the record class R, in order, as follows:

        • The local variable declaration has the same name and declared type as the record component.

        • The local variable declaration has an initializer which is given as if by invoking the corresponding component accessor method on the origin value.

        If evaluation of any of these local variable declaration statements completes abruptly, evaluation of the derived instance creation expression completes abruptly for the same reason.

      4. The contents of the transformation block are executed. If execution completes abruptly, evaluation of the derived instance creation expression completes abruptly for the same reason.

      5. A new instance of record class R is created as if by evaluating a new class instance creation expression (new) with the compile-time type of the origin expression and an argument list containing the local component variables, if any, in the order that they appear in the header of record class R. (This ensures that the canonical constructor will be invoked to create an instance of the record class.) The resulting instance of the record class R is taken as the overall value of the derived instance creation expression. If evaluating the class instance creation expression completes abruptly, evaluation of the derived instance creation expression completes abruptly for the same reason.
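
      The five steps above can be sketched as an ordinary method in today's Java. This is only an approximation written by hand, not the code a compiler would generate, and the class and method names (DerivedCreationSteps, doubled) are hypothetical:

```java
record Point(int x, int y, int z) { }

final class DerivedCreationSteps {
    // Hand-written approximation of: origin with { x *= 2; y *= 2; z *= 2; }
    static Point doubled(Point origin) {
        // Steps 1 and 2: the origin value has been evaluated; null is rejected.
        if (origin == null)
            throw new NullPointerException("origin value is null");

        // Step 3: one local variable per record component, in header order,
        // each initialized by calling the corresponding accessor.
        int x = origin.x();
        int y = origin.y();
        int z = origin.z();

        // Step 4: the contents of the transformation block.
        x *= 2;
        y *= 2;
        z *= 2;

        // Step 5: the canonical constructor receives the final values.
        return new Point(x, y, z);
    }
}
```

      If the body of step 4 were empty, the method would return a new Point that is equal to, but distinct from, the origin value.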

      The use of a derived instance creation expression:

      Point nextLoc = oldLoc with { 
          x *= 2; 
          y *= 2; 
          z *= 2; 
      };

      can be thought of as a switch expression:

      Point nextLoc = switch (oldLoc) {
          case Point(var x, var y, var z) -> {
              x *= 2; 
              y *= 2; 
              z *= 2; 
              yield new Point(x, y, z);
          }
      };

      This makes it clear that the semantics of a derived instance creation expression involves pattern matching the value of oldLoc against a record pattern -- "deconstruction" that initializes the pattern variables x, y, and z -- and creating a new instance of Point by invoking the canonical constructor with the final values of the pattern variables as the arguments, having first executed the three statements that appear in the transformation block.

      Note that if the canonical constructor enforces constraints (as it should), then a derived instance creation expression will implicitly enforce them when constructing the new record value. For example:

      record Rational(int num, int denom) {
          Rational {
              if (denom == 0)
                  throw new IllegalArgumentException("denom must not be zero");
          }
      }
      
      Rational r = new Rational(3, 1);  // OK

      If we execute the statement:

      Rational s = r with { denom = 0; };  // Not OK

      we will get an IllegalArgumentException. Evaluating the derived instance creation expression on r will initialize local variable num to 3 and denom to 1, then assign 0 to denom, then invoke the canonical constructor via new Rational(num, denom). This will throw as if we had tried to evaluate new Rational(3, 0) explicitly.
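
      The same behavior can be observed in today's Java by writing out the evaluation by hand. In the sketch below, the helper withDenom and the class RationalDemo are hypothetical names used only for illustration:

```java
record Rational(int num, int denom) {
    Rational {
        if (denom == 0)
            throw new IllegalArgumentException("denom must not be zero");
    }
}

final class RationalDemo {
    // Hand-written approximation of: r with { denom = newDenom; }
    static Rational withDenom(Rational r, int newDenom) {
        int num = r.num();      // initialized from the origin value
        int denom = r.denom();
        denom = newDenom;       // the transformation
        return new Rational(num, denom);  // the canonical constructor validates
    }

    public static void main(String[] args) {
        Rational r = new Rational(3, 1);
        System.out.println(withDenom(r, 2));
        try {
            withDenom(r, 0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      The validation lives only in the canonical constructor. The derivation code does not repeat it, yet cannot bypass it.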

      The structure and behavior of the transformation block in a derived instance creation expression is similar to the body of a compact constructor in a record class. Both have the same control flow restrictions (must complete normally or throw an exception); both have a set of pre-initialized variables in scope, which are expected to be mutated by the block; and both take the final values of those variables and pass them as arguments to a constructor invocation.
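
      To make the analogy concrete, here is a sketch of a compact constructor that mutates its pre-initialized parameters, using a hypothetical Fraction record that normalizes itself to lowest terms. Any derivation that goes through the canonical constructor is normalized again automatically; the helper doubled is a hand-written stand-in for f with { num *= 2; }:

```java
// Hypothetical record used only for illustration.
record Fraction(int num, int denom) {
    Fraction {
        if (denom == 0)
            throw new IllegalArgumentException("denom must not be zero");
        // num and denom are ordinary parameters here: reassigning them
        // changes the values that are stored in the fields afterwards.
        int g = gcd(Math.abs(num), Math.abs(denom));
        num /= g;
        denom /= g;
    }

    private static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // Hand-written approximation of: this with { num *= 2; }
    Fraction doubled() {
        int num = this.num();
        int denom = this.denom();
        num *= 2;
        return new Fraction(num, denom);
    }
}
```

      For example, doubling a Fraction holding 1 and 2 passes 2 and 2 to the canonical constructor, which stores 1 and 1.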

      Alternatives

      • Instead of supporting an expression form for use-site creation of new record values, we could support it at the declaration site with some form of special support for wither methods. We prefer the flexibility of use-site creation, whereas declaring wither methods would add bloat to record class declarations, which currently enjoy a high degree of succinctness.

      • Instead of supporting transformation blocks, we could instead support only a list of variable declarators on the right-hand side of a derived instance creation expression, e.g., e with { x1 = e1; ..., xn = en; }. This would serve to discourage impure code, but would likely be cumbersome and unnecessarily restrictive.
