JDK-8288476

Primitive types in patterns, instanceof, and switch (Preview)

Details

    • JEP
    • Status: Submitted
    • P2
    • Resolution: Unresolved
    • None
    • specification
    • None
    • Angelos Bimpoudis
    • Feature
    • Open
    • SE
    • amber dash dev at openjdk dot org
    • M
    • M

    Description

      Summary

      Enhance pattern matching by allowing primitive type patterns to be used in all pattern contexts, align the semantics of primitive type patterns with that of instanceof, and extend switch to allow primitive constants as case labels. This is a preview language feature.

      Goals

      • Enable uniform data exploration by allowing type patterns to match values of any type, whether primitive or reference.

      • Align primitive type patterns with safe casting.

      • Allow pattern matching to use primitive type patterns in both nested and top-level contexts.

      • Provide easy-to-use constructs that eliminate the risk of losing information due to unsafe casts.

      • Following the enhancements to switch in Java 5 (enum switch) and Java 7 (string switch), allow switch to process values of any primitive type.

      Non-Goals

      • It is not a goal to introduce new types of conversions or new conversion contexts.

      Motivation

      Records and record patterns work together to streamline data processing. Records (JEP 395) make it easy to aggregate components, and record patterns (JEP 440) make it easy to decompose aggregates using pattern matching.

      In this example, we represent JSON documents with a sealed hierarchy of records:

      sealed interface JsonValue {
          record JsonString(String s) implements JsonValue { }
          record JsonNumber(double d) implements JsonValue { }
          record JsonNull() implements JsonValue { }
          record JsonBoolean(boolean b) implements JsonValue { }
          record JsonArray(List<JsonValue> values) implements JsonValue { }
          record JsonObject(Map<String, JsonValue> map) implements JsonValue { }
      }

      JSON does not distinguish integers from non-integers, so in JsonNumber we represent all numbers with double values as recommended by the specification.

      Given a JSON payload of

      { "name" : "John", "age" : 30 }

      we can construct a corresponding JsonValue via

      var json = new JsonObject(Map.of("name", new JsonString("John"),
                                       "age", new JsonNumber(30)));

      For each key in the map, this code instantiates an appropriate record for the corresponding value. For the first, the value "John" has the same type as the record's component, namely String. For the second, however, the Java compiler applies a widening primitive conversion to convert the int value, 30, to a double.

      Nested primitive type patterns are limited

      We can, of course, use record patterns to disaggregate this JSON value:

      record Customer(String name, int age) { }
      
      if (json instanceof JsonObject(var map)
          && map.get("name") instanceof JsonString(String name)
          && map.get("age") instanceof JsonNumber(double age))
      {
          return new Customer(name, (int)age);    // unavoidable cast
      }

      Here we see that primitive type patterns in nested contexts have a limitation: In this application we expect the age value always to be an int, but from the JsonNumber pattern we can only extract a double and must rely upon a lossy manual cast to convert that to an int. We should return a Customer object only when the age value is representable as an int, which requires additional code:

      if (json instanceof JsonObject(var map)
          && map.get("name") instanceof JsonString(String name)
          && map.get("age") instanceof JsonNumber(double age))
      {
          int age2 = (int)age;                    // unavoidable cast
          if (age2 == age)
              return new Customer(name, age2);
      }

      When we constructed the JsonObject, we were able to pass an int to the constructor where it expected double. But, when we disaggregate the JsonObject with a record pattern, we have to bind a double value, and then manually cast it to int. What we would really like to do is use int directly in the JsonNumber pattern so that the pattern matches only when the double value inside the JsonNumber object can be converted to an int without loss of information, and when it does match it automatically narrows the double value to an int:

      if (json instanceof JsonObject(var map)
          && map.get("name") instanceof JsonString(String name)
          && map.get("age") instanceof JsonNumber(int age))
      {
          return new Customer(name, age);         // no cast!
      }

      This sort of usage is characteristic of pattern matching's ability to reject illegal values automatically. Pattern matching eliminates the need for verbose and potentially unsafe casts by raising match failures to control-flow decisions.

      Unfortunately the above example does not work today because primitive type patterns are invariant: The type of the component being matched must be identical to the type in the primitive type pattern. Thus we cannot write JsonNumber(int age) because the component of the JsonNumber record class was declared to be a double. That is not the case for reference types; for example:

      record Box(Object o) { }
      Box b = new Box(new RedBall());
      if (b instanceof Box(RedBall r)) { ... }

      Here the pattern Box(RedBall r) matches only when b is a Box that holds a RedBall, in which case it binds the local variable r of type RedBall to that object. This all works even though the component of the Box record class was declared to be an Object rather than a RedBall.

      Primitive type patterns should not mean something different from reference type patterns; they should both mean that the value can be cast safely, without loss of information, and they should both allow matching and binding to a type other than the record's original component type when sensible.

      Primitive type patterns are not permitted in top-level contexts

      Primitive type patterns are useful outside of record patterns, but at present they are not permitted in the top levels of patterns.

      For example, suppose we want to process an int value only when it can be cast safely to a byte. Today we can write an explicit range test:

      void processByte(byte b) { ... }
      
      if (i >= -128 && i <= 127) {
          processByte((byte)i);
      }

      Alternatively, we could use round-trip casts:

      if ((int)(byte)i == i) {
          processByte((byte)i);
      }
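
The two hand-written tests above are interchangeable. The following sketch, in plain Java without preview features, checks that they agree around the boundaries of byte. The class and method names (ByteFit, fitsByRange, fitsByRoundTrip) are illustrative only and are not part of this proposal.

```java
// Illustrative sketch: two hand-written ways to test whether an int fits in a byte.
// ByteFit and its method names are hypothetical, chosen only for this example.
final class ByteFit {

    // Explicit range test, as in the first snippet above.
    static boolean fitsByRange(int i) {
        return i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE;
    }

    // Round-trip test: narrow to byte, widen back to int, and compare.
    static boolean fitsByRoundTrip(int i) {
        return (int) (byte) i == i;
    }

    public static void main(String[] args) {
        int[] samples = { -129, -128, 0, 127, 128, 1000 };
        for (int i : samples) {
            System.out.println(i + ": range=" + fitsByRange(i)
                    + ", roundTrip=" + fitsByRoundTrip(i));
        }
    }
}
```

Both tests repeat the target type's limits or the cast itself at every use site, which is the duplication a primitive type pattern would remove.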

      If we could use primitive types in top-level patterns then we could instead write simply:

      if (i instanceof byte b) {
          processByte(b);
      }

      Here the instanceof operator guarantees that, when it returns true, the int value i has been cast safely to a byte and bound to the variable b without any loss of information, i.e., sign or magnitude. In other words, it acts as a safeguard.

      Top-level primitive type patterns could also be used to guard against lossy assignments, which have been a silent risk since Java 1.0. For example, we could rewrite

      int getPopulation() { ... }
      float pop = getPopulation();    // possible loss of information!

      more safely as

      if (getPopulation() instanceof float pop) {
          ... pop ...
      }
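
To make the risk concrete, here is a minimal sketch in plain Java of the kind of test that getPopulation() instanceof float would have to perform. FloatFit and isExactAsFloat are hypothetical names for this example, and comparing through double is just one way to write the check, not necessarily how an implementation would do it.

```java
// Illustrative sketch: detecting whether an int survives conversion to float.
// FloatFit and isExactAsFloat are hypothetical names used only for this example.
final class FloatFit {

    // Both int -> double and float -> double are always exact, so comparing
    // the two widened values reveals whether int -> float rounded the value.
    static boolean isExactAsFloat(int i) {
        return (double) (float) i == (double) i;
    }

    public static void main(String[] args) {
        int small = 16_777_216;   // 2^24, representable as a float
        int large = 16_777_217;   // 2^24 + 1, not representable as a float
        System.out.println(isExactAsFloat(small));   // true
        System.out.println(isExactAsFloat(large));   // false
        System.out.println((int) (float) large);     // 16777216: the last digit was lost
    }
}
```

A naive round trip through int, (int)(float)i == i, would wrongly accept Integer.MAX_VALUE, because the cast back to int saturates. Comparing in double avoids that edge case.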

      Primitive types are not permitted in type comparisons

      Sometimes we only need to use instanceof as a type comparison operator, without binding a pattern variable. For example, today we can write

      if (o instanceof String) {
         ... (String)o ...
      }

      to test that the object denoted by o is indeed a String, so that the cast (String)o will never fail and throw a ClassCastException. We should, similarly, be able to write

      if (i instanceof byte) {
          ... (byte)i ...
      }

      to test that the value denoted by i can be represented as a byte, so that the cast (byte)i will never fail. When casting to a primitive type, failure does not mean throwing an exception but, rather, losing information; if, e.g., the value of i were 1000 (0x3e8) then (byte)i would evaluate to -24 (0xe8). It would be ideal, as a result, to make the meaning of instanceof byte consistent with the pattern match instanceof byte b.

      More generally, type comparisons using instanceof should work on all types, recognizing the deep connection between instanceof and casting. An instanceof test should succeed only when a casting conversion exists from the type of the left-hand operand to the type on the right-hand side, and only when that conversion can be performed without loss of information, i.e., sign, magnitude, or precision.

      Primitive type patterns in switch

      At present, primitive type patterns are not allowed in the top-level context of the instanceof operator, nor are they allowed in the top-level contexts of the case labels of a switch. For example, with a top-level primitive type pattern we could rewrite the switch expression

      switch (x.getStatus()) {
          case 0 -> "okay";
          case 1 -> "warning";
          case 2 -> "error";
          default -> "unknown status: " + x.getStatus();
      }

      more clearly as

      switch (x.getStatus()) {
          case 0 -> "okay";
          case 1 -> "warning";
          case 2 -> "error";
          case int i -> "unknown status: " + i;
      }

      Here the case int i label matches any status value not previously matched, making the switch expression exhaustive so that no default label is required.

      Permitting top-level primitive type patterns would allow guards to be used to further restrict the values matched by case labels:

      switch (x.getYearlyFlights()) {
          case 0 -> ...;
          case 1 -> ...;
          case 5 -> issueDiscount();
          case int i when i > 100 -> issueGoldCard();
          case int i -> ...;
      }

      Combining primitive type patterns with record patterns creates further opportunities for case analysis:

      switch (x.order()) {
          case NormalOrder(Product(int productCode)) -> ...;
          case BadOrder x -> switch (x.reason()) {
              case MissingProduct q -> switch (q.code()) {
                  case 1     -> ...;
                  case 2     -> ...;
                  case int i -> ...;
              }
          }
      }

      switch does not support all primitive types

      At present, switch expressions and switch statements can switch on values of the primitive types byte, short, char, and int — but not boolean, float, double, or long. We can switch on a long value only when it fits within an int, so we must handle any remaining cases with if statements:

      long v = ...;
      if (v == (int)v) {
          switch ((int)v) {
              case 0x01  -> ...;
              case 0x02  -> ...;
              default    -> ...;
          }
      }
      
      if (v == 10_000_000_000L) { ... }
      if (v == 20_000_000_000L) { ... }

      If we could use long constant expressions in case labels then we could instead write:

      long v = ...;
      switch (v) {
          case 0x01            -> ...;
          case 0x02            -> ...;
          case 10_000_000_000L -> ...;
          case 20_000_000_000L -> ...;
          case long l          -> ... l ...;
      }

      Similarly, consider code that uses if-else chains to test float values:

      float f = ...;
      if (Float.isNaN(f)) {
          ...
      } else if (Float.isInfinite(f)) {
          ...
      } else {
          ...
      }

      With float values in case labels we could declutter this into:

      float f = ...;
      switch (f) {
          case Float.NaN    -> ...;
          case Float.POSITIVE_INFINITY -> ...;
          case Float.NEGATIVE_INFINITY -> ...;
          case float g -> ... g ...;
      }

      Switching on boolean values could be a useful alternative to the ternary conditional operator (?:). Unlike that operator, a boolean switch expression can contain both expressions and statements in its rules. For example:

      startProcessing(OrderStatus.NEW, switch (user.isLoggedIn()) {
          case true  -> user.id();
          case false -> { log("Unrecognized user"); yield -1; }
      });

      Here the second argument to the startProcessing method uses a boolean switch to encapsulate some business logic.

      When switching on a primitive value, a switch expression or statement should automatically convert between the type of that value and the types of its case labels — as long as those conversions do not lose information. For example, when switching on a float value the case labels could be of type float, double, int, or long as long as the constant value of each label converts sensibly to a float.

      float f = ...;
      switch (f) {
          case 16_777_216 -> ...;
          case 16_777_217 -> ...;    // error: duplicate label
          default -> ...;
      }

      This switch accepts a float but its case labels are integral values that convert to the same float value. The cases are indistinguishable at run time, so this code should be rejected at compile time.

      In summary, primitive types in instanceof, and in type patterns for instanceof and switch, would increase program reliability and enable more uniform data exploration with pattern matching.

      Description

      We propose to define the semantics of all type patterns in terms of the instanceof type comparison operator. Henceforth, instanceof will not only be able to test and compare reference types but also safeguard any casting conversion. As a result, instanceof will be able to test whether a value can be cast safely to a target type, i.e., without throwing a ClassCastException, without throwing a NullPointerException, and, in the case of primitive types, without losing information (sign, magnitude, or precision).

      Having done that, we then need only to lift the remaining restrictions on primitive types in type patterns and in switch blocks in order to achieve our goals.

      instanceof as the precondition test for safe casting conversions

      As of Java 16, the instanceof operator is either a type comparison operator (e.g., o instanceof String) or a pattern match operator (e.g., o instanceof String s), depending on its syntactic form.

      To enable primitive types to be used with the instanceof type comparison operator we remove the restrictions that (1) the type of the left-hand operand must be a reference type, and (2) the right-hand operand must name a reference type. The form of type comparison expressions thus becomes:

      InstanceofExpression:
          RelationalExpression instanceof Type
          ...

      At present, the result of a type comparison expression x instanceof T is false if x denotes the null reference, true if x can be cast to the reference type T without raising a ClassCastException, and false otherwise. We generalize the semantics of such expressions to test whether x can be converted exactly to the given primitive or reference type T in a casting context (JLS §5.5) without loss of information. It remains a compile-time error to use instanceof if no cast conversion exists from the static type of x to the type T. Under this generalization, the instanceof type comparison operator works for all pairs of types that can be converted in a casting context.

      The examples given earlier rely on conversions allowed in a casting context, so they can be rewritten to use instanceof directly:

      int i = 1000;
      if (i instanceof byte) {     // false
        byte b = (byte)i;
        ... b ...
      }
      
      byte b = 42;
      if (b instanceof int) {      // true
        int i = (int)b;
        ... i ...
      }
      
      int i = 16_777_216;          // 2^24
      if (i instanceof float) {    // true
        float f = (float)i;
        ... f ...
      }
      
      int i = 16_777_217;          // 2^24+1
      if (i instanceof float) {    // false
        float f = (float)i;
        ... f ...
      }

      We do not add any new conversions to casting contexts, nor do we create any new conversion contexts. Whether instanceof is applicable to a given expression and type is determined solely by whether a conversion is allowed by the casting context. The conversions (JLS §5.1) permitted in casting contexts are:

      • Identity conversions,
      • Widening primitive conversions,
      • Narrowing primitive conversions,
      • Widening and narrowing primitive conversions,
      • Boxing conversions, and
      • Unboxing conversions

      as well as specified combinations of:

      • An identity conversion,
      • A widening reference conversion,
      • A widening reference conversion followed by an unboxing conversion,
      • A widening reference conversion followed by an unboxing conversion and then a widening primitive conversion,
      • A narrowing reference conversion,
      • A narrowing reference conversion followed by an unboxing conversion,
      • An unboxing conversion, and
      • An unboxing conversion followed by a widening primitive conversion.

      Consider the following examples. All of these are allowed because the left-hand operand of the instanceof operator can be converted, in a casting context, to the type specified by the right-hand operand:

      int i = ...
      i instanceof byte
      i instanceof float
      
      boolean b = ...
      b instanceof Boolean
      
      Short s = ...
      s instanceof int
      s instanceof long
      
      long l = ...
      l instanceof float
      l instanceof double
      
      Long ll = ...
      ll instanceof float
      ll instanceof double

      However, all of the following examples raise a compile-time error, since they do not correspond to an existing casting conversion:

      boolean b = ...
      b instanceof char       // error
      
      Byte bb = ...
      bb instanceof char      // error
      
      Integer ii = ...
      ii instanceof byte      // error
      ii instanceof short     // error
      
      Long ll = ...
      ll instanceof int       // error
      ll instanceof Float     // error
      ll instanceof Double    // error

      If the left-hand operand is of a reference type and its value is null, instanceof continues to evaluate to false.
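
As a rough illustration of how these rules combine for a boxed operand, the sketch below spells out in plain Java what ll instanceof double could amount to for a Long: a null check, an unboxing conversion, and an exactness test. BoxedCheck and matchesDouble are hypothetical names, and the round-trip formulation is an assumption made for this example.

```java
// Illustrative sketch of how `ll instanceof double` could be read for a Long operand.
// BoxedCheck and matchesDouble are hypothetical names, not part of the proposal.
final class BoxedCheck {

    static boolean matchesDouble(Long ll) {
        if (ll == null) {
            return false;             // null never matches, as with reference types today
        }
        long l = ll;                  // unboxing conversion
        // long -> double may round large magnitudes. The round trip alone would
        // wrongly accept Long.MAX_VALUE, because casting 2^63 back to long
        // saturates to Long.MAX_VALUE, so that value is excluded explicitly.
        return (long) (double) l == l && l != Long.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(matchesDouble(null));              // false
        System.out.println(matchesDouble(42L));               // true
        System.out.println(matchesDouble((1L << 53) + 1));    // false: 2^53 + 1 rounds
    }
}
```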

      Exactness of casting conversions

      A conversion is exact if no loss of information occurs. Whether a conversion is exact depends on the pair of types involved and potentially on the input value:

      • For some pairs, the conversion from the first type to the second type is guaranteed not to lose information for any value and thus requires no action at run time. The conversion is said to be unconditionally exact.

      • For other pairs, a run-time test is needed to check whether the value can be converted from the first type to the second type without loss of information. Examples include long to int and int to float — both of these conversions detect loss of precision by relying on the notion of representation equivalence defined in the specification of the java.lang.Double class.

      A primitive conversion is unconditionally exact if it widens from one integral type to another, widens from one floating-point type to another, widens from byte, short, or char to a floating-point type, or widens int to double.

      In more detail, using the notation of JLS §5.5, the following table marks the unconditionally exact primitive conversions with the symbol ε. For completeness, - means no conversion is allowed, ≈ means the identity conversion, ω means a widening primitive conversion, η means a narrowing primitive conversion, and ωη means a widening and narrowing primitive conversion.

      From ↓ / To →   byte   short   char   int   long   float   double   boolean
      byte            ≈      ε       ωη     ε     ε      ε       ε        -
      short           η      ≈       η      ε     ε      ε       ε        -
      char            η      η       ≈      ε     ε      ε       ε        -
      int             η      η       η      ≈     ε      ω       ε        -
      long            η      η       η      η     ≈      ω       ω        -
      float           η      η       η      η     η      ≈       ε        -
      double          η      η       η      η     η      η       ≈        -
      boolean         -      -       -      -     -      -       -        ≈
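
The prose rule for unconditional exactness can also be written down as a small lookup. The sketch below is one possible encoding. Prim and UnconditionalExactness are hypothetical names, and identity conversions are deliberately left out so that the method answers true only for the cells the table marks as unconditionally exact.

```java
import java.util.EnumSet;

// Illustrative sketch: the rule for unconditionally exact primitive conversions
// encoded as a lookup. Prim and UnconditionalExactness are hypothetical names.
enum Prim { BYTE, SHORT, CHAR, INT, LONG, FLOAT, DOUBLE, BOOLEAN }

final class UnconditionalExactness {

    static boolean isUnconditionallyExact(Prim from, Prim to) {
        switch (from) {
            case BYTE:
                // byte -> char is a widening and narrowing conversion, so it is excluded.
                return EnumSet.of(Prim.SHORT, Prim.INT, Prim.LONG, Prim.FLOAT, Prim.DOUBLE)
                        .contains(to);
            case SHORT:
            case CHAR:
                return EnumSet.of(Prim.INT, Prim.LONG, Prim.FLOAT, Prim.DOUBLE).contains(to);
            case INT:
                // int -> float can round, so only long and double qualify.
                return EnumSet.of(Prim.LONG, Prim.DOUBLE).contains(to);
            case FLOAT:
                return to == Prim.DOUBLE;
            default:
                // long, double, and boolean have no unconditionally exact targets.
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUnconditionallyExact(Prim.INT, Prim.DOUBLE));   // true
        System.out.println(isUnconditionallyExact(Prim.INT, Prim.FLOAT));    // false
    }
}
```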

      In the following examples the unconditionally exact conversions are marked with (ε). Those instanceof tests always evaluate to true regardless of the value; all the others require a run-time test.

      byte b = 42;
      b instanceof int;         // true (ε)
      
      int i = 1000;
      i instanceof byte;        // false
      
      int i = 42;
      i instanceof byte;        // true
      
      int i = 16_777_217;       // 2^24+1
      i instanceof float;       // false
      i instanceof double;      // true (ε)
      i instanceof Integer;     // true (ε)
      i instanceof Number;      // true (ε)
      
      float f = 1000.0f;
      f instanceof byte;        // false
      f instanceof int;         // true
      f instanceof double;      // true (ε)
      
      double d = 1000.0d;
      d instanceof byte;        // false
      d instanceof int;         // true
      d instanceof float;       // true
      
      Integer ii = 1000;
      ii instanceof int;        // true
      ii instanceof float;      // true
      ii instanceof double;     // true
      
      Integer ii = 16_777_217;
      ii instanceof float;      // false
      ii instanceof double;     // true
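
For the floating-point rows above, a run-time test has to cope with values that have no integral counterpart at all. The following sketch shows one way a double-to-int exactness test could be written in plain Java. DoubleToInt and isExactAsInt are hypothetical names, and rejecting negative zero is an assumption that follows from representation equivalence, under which -0.0 and 0.0 are distinct.

```java
// Illustrative sketch of a run-time exactness test for double -> int.
// DoubleToInt and isExactAsInt are hypothetical names used only for this example.
final class DoubleToInt {

    static boolean isExactAsInt(double d) {
        // NaN fails because NaN == x is always false. Infinities and other
        // out-of-range values fail because the cast to int saturates.
        boolean roundTrips = d == (double) (int) d;
        // -0.0 would come back as 0.0 and lose its sign, so it is rejected
        // (assumption: exactness is judged by representation equivalence).
        boolean negativeZero =
                Double.doubleToRawLongBits(d) == Double.doubleToRawLongBits(-0.0d);
        return roundTrips && !negativeZero;
    }

    public static void main(String[] args) {
        System.out.println(isExactAsInt(1000.0d));      // true
        System.out.println(isExactAsInt(1000.5d));      // false
        System.out.println(isExactAsInt(Double.NaN));   // false
        System.out.println(isExactAsInt(-0.0d));        // false
    }
}
```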

      Primitive type patterns

      At present, type patterns allow primitive types only when they appear in a nested pattern list of a record pattern; they are not permitted in top-level contexts. We lift that restriction and then define the semantics of primitive type patterns, and of reference type patterns on targets of primitive type, in terms of safe casting conversions:

      • A type pattern T t is applicable to a target of type U if a U could be cast to T without an unchecked warning.

      • A type pattern T t is unconditional on a target of type U if all values of U can be exactly cast to T. This includes widening from one reference type to another, boxing, and any of the unconditionally exact primitive conversions defined above.

      • A set of patterns containing a type pattern T t is exhaustive on a target of type U if T t is unconditional on U or if there is an unboxing conversion from U to T.

      • A type pattern T t dominates a type pattern U u, or a record pattern U(...), if T t would be unconditional on a target of type U.

      • A type pattern T t that is not null-matching is said to match a target u if u instanceof T. The instanceof check ensures that the implied casting conversion would not result in loss of information or error.

      Exhaustiveness

      A switch expression requires that all statically-known possible values of the selector expression be handled in the switch block; in other words, the switch must be exhaustive. While a switch can be exhaustive if it contains an unconditional type pattern, it can be exhaustive in other situations as well, deferring any unhandled cases to run time. If a set of patterns is exhaustive for a type then the run-time values that are not matched by any pattern in the set are the remainder of the set. (For further detail, see Patterns: Exhaustiveness, Unconditionality, and Remainder.)

      With pattern labels involving record patterns, some patterns are considered to be exhaustive even when they are not unconditional. For example:

      Box<Box<String>> bbs = ...
      switch (bbs) {
          case Box(Box(String s)): ...
      }

      This switch is considered exhaustive on Box<Box<String>> even though the pattern Box(Box(String s)) will not match the pathological value new Box(null), which is in the remainder set and is handled by a synthetic default clause that throws MatchException.

      With the introduction of primitive type patterns, we observe that unboxing follows the same philosophy. For example:

      Box<Integer> bi = ...
      switch (bi) {
          case Box(int i): ...
      }

      This switch is considered exhaustive on Box<Integer> even though the pattern Box(int i) will not match the pathological value new Box(null), which is in the remainder set.

      Constant expressions in case labels

      The primitive types long, float, double, and boolean (and their corresponding boxed types Long, Float, Double, and Boolean) can now be used in type patterns in the case labels of a switch block as long as the type of the selector expression is either the same type or its corresponding boxed (or unboxed) type.

      Constants of the primitive types can be used in case labels as long as they have the same type as the selector expression (or its unboxed type). For example:

      switch (f) {
          case 0f -> 5f + 0f;
          case Float fi when fi == 1f -> 6f + fi;
          case Float fi -> 7f + fi;
      }

      Here the constant expression 0f can be used only when the type of the selector expression, f, is either float or Float.

      The semantics of floating-point constants in case labels is defined in terms of representation equivalence at both run time and compile time. It is a compile-time error to use two floating-point constants that are representationally equivalent. For example:

      float f = ...
      switch (f) {
          case 1.0f -> ...
          case 0.999999999f -> ...    // error: duplicate label
          default -> ...
      }

      While 1.0f is representable as a float, 0.999999999f is not. The latter is rounded up to 1.0f, thus creating a duplicate case label.
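
Representation equivalence differs from the == operator in how it treats zeros and NaN. A minimal sketch of such a comparison in plain Java follows. LabelEquivalence and sameLabel are hypothetical names, and using Float.floatToIntBits is one way to express the relation, not necessarily how a compiler would implement it.

```java
// Illustrative sketch: comparing float constants by representation equivalence.
// LabelEquivalence and sameLabel are hypothetical names used only for this example.
final class LabelEquivalence {

    // Float.floatToIntBits collapses every NaN to one canonical bit pattern
    // while keeping +0.0f and -0.0f distinct.
    static boolean sameLabel(float a, float b) {
        return Float.floatToIntBits(a) == Float.floatToIntBits(b);
    }

    public static void main(String[] args) {
        System.out.println(sameLabel(1.0f, 0.999999999f));   // true: the literal rounds to 1.0f
        System.out.println(sameLabel(0.0f, -0.0f));          // false, although 0.0f == -0.0f
        System.out.println(sameLabel(Float.NaN, Float.NaN)); // true, although NaN != NaN
    }
}
```

Under this reading, a label such as case Float.NaN can match a NaN selector value, which a comparison with == never could.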

      Since the boolean type (and its corresponding boxed type) has only two distinct values, a switch that lists both the true and false cases is considered exhaustive:

      boolean b = ...
      switch (b) {
        case true -> ...
        case false -> ...
        // Alternatively: case true, false -> ...
      }

      It is a compile-time error for this switch to include a default clause.

          People

            mr Mark Reinhold
            abimpoudis Angelos Bimpoudis
            Angelos Bimpoudis Angelos Bimpoudis
            Alex Buckley