Details
JEP, Status: Submitted, P2, Resolution: Unresolved, Angelos Bimpoudis, Feature, Open, SE, M, M
Description
Summary
Enhance pattern matching by allowing primitive type patterns to be used in all pattern contexts, align the semantics of primitive type patterns with that of instanceof, and extend switch to allow primitive constants as case labels. This is a preview language feature.
Goals
- Enable uniform data exploration by allowing type patterns to match values of any type, whether primitive or reference.
- Align primitive type patterns with safe casting.
- Allow pattern matching to use primitive type patterns in both nested and top-level contexts.
- Provide easy-to-use constructs that eliminate the risk of losing information due to unsafe casts.
- Following the enhancements to switch in Java 5 (enum switch) and Java 7 (string switch), allow switch to process values of any primitive type.
Non-Goals
- It is not a goal to introduce new types of conversions or new conversion contexts.
Motivation
Records and record patterns work together to streamline data processing. Records (JEP 395) make it easy to aggregate components, and record patterns (JEP 440) make it easy to decompose aggregates using pattern matching.
In this example, we represent JSON documents with a sealed hierarchy of records:
sealed interface JsonValue {
record JsonString(String s) implements JsonValue { }
record JsonNumber(double d) implements JsonValue { }
record JsonNull() implements JsonValue { }
record JsonBoolean(boolean b) implements JsonValue { }
record JsonArray(List<JsonValue> values) implements JsonValue { }
record JsonObject(Map<String, JsonValue> map) implements JsonValue { }
}
JSON does not distinguish integers from non-integers, so in JsonNumber we represent all numbers with double values as recommended by the specification.
Given a JSON payload of
{ "name" : "John", "age" : 30 }
we can construct a corresponding JsonValue via
var json = new JsonObject(Map.of("name", new JsonString("John"),
                                 "age", new JsonNumber(30)));
For each key in the map, this code instantiates an appropriate record for the corresponding value. For the first, the value "John" has the same type as the record's component, namely String. For the second, however, the Java compiler applies a widening primitive conversion to convert the int value, 30, to a double.
Nested primitive type patterns are limited
We can, of course, use record patterns to disaggregate this JSON value:
record Customer(String name, int age) { }
if (json instanceof JsonObject(var map)
    && map.get("name") instanceof JsonString(String name)
    && map.get("age") instanceof JsonNumber(double age))
{
    return new Customer(name, (int)age);  // unavoidable cast
}
Here we see that primitive type patterns in nested contexts have a limitation: In this application we expect the age value always to be an int, but from the JsonNumber pattern we can only extract a double and must rely upon a lossy manual cast to convert that to an int. We should return a Customer object only when the age value is representable as an int, which requires additional code:
if (json instanceof JsonObject(var map)
    && map.get("name") instanceof JsonString(String name)
    && map.get("age") instanceof JsonNumber(double age))
{
    int age2 = (int)age;  // unavoidable cast
    if (age2 == age)
        return new Customer(name, age2);
}
When we constructed the JsonObject, we were able to pass an int to the constructor where it expected double. But, when we disaggregate the JsonObject with a record pattern, we have to bind a double value, and then manually cast it to int. What we would really like to do is use int directly in the JsonNumber pattern so that the pattern matches only when the double value inside the JsonNumber object can be converted to an int without loss of information, and when it does match it automatically narrows the double value to an int:
if (json instanceof JsonObject(var map)
    && map.get("name") instanceof JsonString(String name)
    && map.get("age") instanceof JsonNumber(int age))
{
    return new Customer(name, age);  // no cast!
}
This sort of usage is characteristic of pattern matching's ability to reject illegal values automatically. Pattern matching eliminates the need for verbose and potentially unsafe casts by raising match failures to control-flow decisions.
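Until such a pattern is available, the check that JsonNumber(int age) would perform has to be written by hand. The following is a minimal sketch under stated assumptions, not the JEP's implementation: the class name CustomerExtraction, the method toCustomer, and the Optional return type are hypothetical, the JSON hierarchy is trimmed to the three records the example needs, and accessor calls stand in for record patterns so that the code compiles on Java 16 or later without preview features.

```java
import java.util.Map;
import java.util.Optional;

final class CustomerExtraction {
    // Trimmed-down version of the JSON hierarchy from the example (assumption).
    interface JsonValue { }
    record JsonString(String s) implements JsonValue { }
    record JsonNumber(double d) implements JsonValue { }
    record JsonObject(Map<String, JsonValue> map) implements JsonValue { }
    record Customer(String name, int age) { }

    // Hand-written stand-in for matching JsonNumber(int age): a Customer is
    // produced only if the double narrows to an int without losing information.
    static Optional<Customer> toCustomer(JsonValue json) {
        if (json instanceof JsonObject obj
                && obj.map().get("name") instanceof JsonString name
                && obj.map().get("age") instanceof JsonNumber number) {
            double d = number.d();
            int age = (int) d;
            // Double.compare distinguishes -0.0 from 0.0, so -0.0 is rejected too.
            if (Double.compare((double) age, d) == 0) {
                return Optional.of(new Customer(name.s(), age));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        JsonValue whole = new JsonObject(Map.of("name", new JsonString("John"),
                                                "age", new JsonNumber(30)));
        JsonValue fractional = new JsonObject(Map.of("name", new JsonString("John"),
                                                     "age", new JsonNumber(30.5)));
        System.out.println(toCustomer(whole).isPresent());      // a Customer is produced
        System.out.println(toCustomer(fractional).isPresent()); // the match fails
    }
}
```

The match failure surfaces as an empty Optional here; with the proposed pattern it would simply be a failed instanceof test.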
Unfortunately the above example does not work today because primitive type patterns are invariant: The type of the component being matched must be identical to the type in the primitive type pattern. Thus we cannot write JsonNumber(int age) because the component of the JsonNumber record class was declared to be a double. That is not the case for reference types; for example:
record Box(Object o) { }
Box b = new Box(new RedBall());
if (b instanceof Box(RedBall r)) { ... }
Here the pattern Box(RedBall r) matches only when b is a Box that holds a RedBall, in which case it binds the local variable r of type RedBall to that object. This all works even though the component of the Box record class was declared to be an Object rather than a RedBall.
Primitive type patterns should not mean something different from reference type patterns; they should both mean that the value can be cast safely, without loss of information, and they should both allow matching and binding to a type other than the record's original component type when sensible.
Primitive type patterns are not permitted in top-level contexts
Primitive type patterns are useful outside of record patterns, but at present they are not permitted in the top levels of patterns.
For example, suppose we want to process an int value only when it can be cast safely to a byte. Today we can write an explicit range test:
void processByte(byte b) { ... }
if (i >= -128 && i <= 127) {
    processByte((byte)i);
}
Alternatively, we could use round-trip casts:
if ((int)(byte)i == i) {
    processByte((byte)i);
}
If we could use primitive types in top-level patterns then we could instead write simply:
if (i instanceof byte b) {
    processByte(b);
}
Here the instanceof operator guarantees that, when it returns true, the int value i has been cast safely to a byte and bound to the variable b without any loss of information, i.e., sign or magnitude. In other words, it acts as a safeguard.
Top-level primitive type patterns could also be used to guard against lossy assignments, which have been a silent risk since Java 1.0. For example, we could rewrite
int getPopulation() { ... }
float pop = getPopulation(); // possible loss of information!
more safely as
if (getPopulation() instanceof float pop) {
    ... pop ...
}
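To make the risk of the lossy assignment concrete, the following sketch (an illustration in current Java, not part of the proposal) checks whether an int survives conversion to float. The class name LossyAssignment and the helper fitsInFloat are hypothetical. The comparison is carried out in double because both int to double and float to double are exact, whereas a naive (int)(float)i round trip saturates at Integer.MAX_VALUE and would wrongly report success for that value.

```java
final class LossyAssignment {
    // True when converting i to float preserves its value exactly.
    // Both sides are widened to double, which can represent every int
    // and every float, so the comparison itself loses nothing.
    static boolean fitsInFloat(int i) {
        return (double) (float) i == (double) i;
    }

    public static void main(String[] args) {
        int[] populations = { 1_000_000, 16_777_216, 16_777_217, Integer.MAX_VALUE };
        for (int p : populations) {
            float f = p; // compiles without any warning today
            System.out.println(p + " -> " + f + ", exact: " + fitsInFloat(p));
        }
    }
}
```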
Primitive types are not permitted in type comparisons
Sometimes we only need to use instanceof as a type comparison operator, without binding a pattern variable. For example, today we can write
if (o instanceof String) {
    ... (String)o ...
}
to test that the object denoted by o is indeed a String, so that the cast (String)o will never fail and throw a ClassCastException. We should, similarly, be able to write
if (i instanceof byte) {
    ... (byte)i ...
}
to test that the value denoted by i can be represented as a byte, so that the cast (byte)i will never fail. When casting to a primitive type, failure does not mean throwing an exception but, rather, losing information; if, e.g., the value of i were 1000 (0x3e8) then (byte)i would evaluate to -24 (0xe8). It would be ideal, as a result, to make the meaning of instanceof byte consistent with the pattern match instanceof byte b.
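The silent truncation described here can be observed directly in current Java. The sketch below is illustrative only; the class name ByteNarrowing and the helper fitsInByte are hypothetical, and fitsInByte is a hand-written approximation of what the proposed test i instanceof byte would decide.

```java
final class ByteNarrowing {
    // Hand-written stand-in for the proposed test "i instanceof byte":
    // the byte is promoted back to int for the comparison, so the test
    // succeeds only when the narrowing cast dropped nothing.
    static boolean fitsInByte(int i) {
        return (byte) i == i;
    }

    public static void main(String[] args) {
        int i = 1000;                      // 0x3e8
        System.out.println((byte) i);      // prints -24: only the low byte 0xe8 survives
        System.out.println(fitsInByte(i)); // prints false
    }
}
```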
More generally, type comparisons using instanceof should work on all types, recognizing the deep connection between instanceof and casting. An instanceof test should succeed only when a casting conversion exists from the type of the left-hand operand to the type on the right-hand side, and only when that conversion can be performed without loss of information, i.e., sign, magnitude, or precision.
Primitive type patterns in switch
At present, primitive type patterns are not allowed in the top-level context of the instanceof operator, nor are they allowed in the top-level contexts of the case labels of a switch. For example, with a top-level primitive type pattern we could rewrite the switch expression
switch (x.getStatus()) {
    case 0 -> "okay";
    case 1 -> "warning";
    case 2 -> "error";
    default -> "unknown status: " + x.getStatus();
}
more clearly as
switch (x.getStatus()) {
    case 0 -> "okay";
    case 1 -> "warning";
    case 2 -> "error";
    case int i -> "unknown status: " + i;
}
Here the case int i label matches any status value not previously matched, making the switch expression exhaustive so that no default label is required.
Permitting top-level primitive type patterns would allow guards to be used to further restrict the values matched by case labels:
switch (x.getYearlyFlights()) {
    case 0 -> ...;
    case 1 -> ...;
    case 5 -> issueDiscount();
    case int i when i > 100 -> issueGoldCard();
    case int i -> ...;
}
Nesting primitive type patterns inside record patterns opens up further opportunities for case analysis:
switch (x.order()) {
    case NormalOrder(Product(int productCode)) -> ...;
    case BadOrder bad -> switch (bad.reason()) {
        case MissingProduct q -> switch (q.code()) {
            case 1 -> ...;
            case 2 -> ...;
            case int i -> ...;
        }
    }
}
switch does not support all primitive types
At present, switch expressions and switch statements can switch on values of the primitive types byte, short, char, and int, but not boolean, float, double, or long. We can switch on a long value only when it fits within an int, so we must handle any remaining cases with if statements:
long v = ...;
if (v == (int)v) {
    switch ((int)v) {
        case 0x01 -> ...;
        case 0x02 -> ...;
        case int i -> ... i ...;
    }
}
if (v == 10_000_000_000L) { ... }
if (v == 20_000_000_000L) { ... }
If we could use long constant expressions in case labels then we could instead write:
long v = ...;
switch (v) {
    case 0x01 -> ...;
    case 0x02 -> ...;
    case 10_000_000_000L -> ...;
    case 20_000_000_000L -> ...;
    case long l -> ... l ...;
}
Similarly, consider code that uses if-else chains to test float values:
float f = ...;
if (Float.isNaN(f)) {
    ...
} else if (Float.isInfinite(f)) {
    ...
} else {
    ...
}
With float values in case labels we could declutter this into:
float f = ...;
switch (f) {
    case Float.NaN -> ...;
    case Float.POSITIVE_INFINITY -> ...;
    case Float.NEGATIVE_INFINITY -> ...;
    case float g -> ... g ...;
}
Switching on boolean values could be a useful alternative to the ternary conditional operator (?:). Unlike that operator, a boolean switch expression can contain both expressions and statements in its rules. For example:
startProcessing(OrderStatus.NEW, switch (user.isLoggedIn()) {
    case true -> user.id();
    case false -> { log("Unrecognized user"); yield -1; }
});
Here the second argument to the startProcessing method uses a boolean switch to encapsulate some business logic.
When switching on a primitive value, a switch expression or statement should automatically convert between the type of that value and the types of its case labels, as long as those conversions do not lose information. For example, when switching on a float value the case labels could be of type float, double, int, or long as long as the constant value of each label converts sensibly to a float.
float f = ...;
switch (f) {
    case 16_777_216 -> ...;
    case 16_777_217 -> ...;  // error: duplicate label
    default -> ...;
}
This switch accepts a float but its case labels are integral values that convert to the same float value. The cases are indistinguishable at run time, so this code should be rejected at compile time.
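The collision between the two labels can be checked in current Java. This is a small illustrative sketch; the class name FloatLabelCollision and the helper sameFloat are hypothetical names introduced here.

```java
final class FloatLabelCollision {
    // True when the two int constants convert to the same float value,
    // i.e. when they could not be told apart as labels of a float switch.
    static boolean sameFloat(int a, int b) {
        return Float.compare((float) a, (float) b) == 0;
    }

    public static void main(String[] args) {
        // 2^24 + 1 has no float representation and rounds to 2^24.
        System.out.println(sameFloat(16_777_216, 16_777_217)); // prints true
        // 2^24 + 2 is representable, so it stays distinct.
        System.out.println(sameFloat(16_777_216, 16_777_218)); // prints false
    }
}
```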
In summary, primitive types in instanceof, and in type patterns for instanceof and switch, would increase program reliability and enable more uniform data exploration with pattern matching.
Description
We propose to define the semantics of all type patterns in terms of the instanceof type comparison operator. Henceforth, instanceof will not only be able to test and compare reference types but also safeguard any casting conversion. As a result, instanceof will be able to test whether a value can be cast safely to a target type, i.e., without throwing a ClassCastException, without throwing a NullPointerException, and, in the case of primitive types, without losing information (sign, magnitude, or precision).
Having done that, we then need only to lift the remaining restrictions on primitive types in type patterns and in switch blocks in order to achieve our goals.
instanceof as the precondition test for safe casting conversions
As of Java 16, the instanceof operator is either a type comparison operator (e.g., o instanceof String) or a pattern match operator (e.g., o instanceof String s), depending on its syntactic form.
To enable primitive types to be used with the instanceof type comparison operator we remove the restrictions that (1) the type of the left-hand operand must be a reference type, and (2) the right-hand operand must name a reference type. The form of type comparison expressions thus becomes:
InstanceofExpression:
    RelationalExpression instanceof Type
    ...
At present, the result of a type comparison expression x instanceof T is false if x denotes the null reference, true if x can be cast to the reference type T without raising a ClassCastException, and false otherwise. We generalize the semantics of such expressions to test whether x can be converted exactly to the given primitive or reference type T in a casting context (JLS §5.5) without loss of information. It remains a compile-time error to use instanceof if no casting conversion exists from the static type of x to the type T. Under this generalization, the instanceof type comparison operator works for all pairs of types that can be converted in a casting context.
The examples given earlier rely on conversions allowed in a casting context, so they can be rewritten to use instanceof directly:
int i = 1000;
if (i instanceof byte) {    // false
    byte b = (byte)i;
    ... b ...
}

byte b = 42;
if (b instanceof int) {     // true
    int i = (int)b;
    ... i ...
}

int i = 16_777_216;         // 2^24
if (i instanceof float) {   // true
    float f = (float)i;
    ... f ...
}

int i = 16_777_217;         // 2^24+1
if (i instanceof float) {   // false
    float f = (float)i;
    ... f ...
}
We do not add any new conversions to casting contexts, nor do we create any new conversion contexts. Whether instanceof is applicable to a given expression and type is determined solely by whether a conversion is allowed by the casting context. The conversions (JLS §5.1) permitted in casting contexts are:
- Identity conversions,
- Widening primitive conversions,
- Narrowing primitive conversions,
- Widening and narrowing primitive conversions,
- Boxing conversions, and
- Unboxing conversions
as well as specified combinations of:
- An identity conversion,
- A widening reference conversion,
- A widening reference conversion followed by an unboxing conversion,
- A widening reference conversion followed by an unboxing conversion and then a widening primitive conversion,
- A narrowing reference conversion,
- A narrowing reference conversion followed by an unboxing conversion,
- An unboxing conversion, and
- An unboxing conversion followed by a widening primitive conversion.
Consider the following examples. All of these are allowed because the left-hand operand of the instanceof operator can be converted, in a casting context, to the type specified by the right-hand operand:
int i = ...
i instanceof byte
i instanceof float
boolean b = ...
b instanceof Boolean
Short s = ...
s instanceof int
s instanceof long
long l = ...
l instanceof float
l instanceof double
Long ll = ...
ll instanceof float
ll instanceof double
However, all of the following examples raise a compile-time error, since they do not correspond to an existing casting conversion:
boolean b = ...
b instanceof char // error
Byte bb = ...
bb instanceof char // error
Integer ii = ...
ii instanceof byte // error
ii instanceof short // error
Long ll = ...
ll instanceof int // error
ll instanceof Float // error
ll instanceof Double // error
If the left-hand operand is of a reference type and its value is null, instanceof continues to evaluate to false.
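For a boxed operand, the test therefore amounts to a null check followed by the check for the unboxed value. The sketch below spells this out in current Java for two of the allowed combinations listed earlier; it is an approximation for illustration, not the specified implementation. The class name BoxedOperand and both method names are hypothetical, and comparing through BigDecimal is merely one straightforward way to test long-to-float exactness without overflow pitfalls.

```java
import java.math.BigDecimal;

final class BoxedOperand {
    // Stand-in for "s instanceof int" with a Short operand: once the null
    // case is excluded, unboxing followed by short-to-int widening is always exact.
    static boolean shortMatchesInt(Short s) {
        return s != null;
    }

    // Stand-in for "ll instanceof float" with a Long operand: null never
    // matches; otherwise the unboxed long must survive conversion to float.
    static boolean longMatchesFloat(Long ll) {
        if (ll == null) {
            return false;
        }
        long v = ll;
        // new BigDecimal(double) is exact, and float-to-double widening is exact,
        // so this compares the true float value against the original long.
        return new BigDecimal((double) (float) v).compareTo(BigDecimal.valueOf(v)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(longMatchesFloat(null));             // prints false
        System.out.println(longMatchesFloat(1L << 24));         // prints true
        System.out.println(longMatchesFloat((1L << 24) + 1));   // prints false
    }
}
```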
Exactness of casting conversions
A conversion is exact if no loss of information occurs. Whether a conversion is exact depends on the pair of types involved and potentially on the input value:
- For some pairs, the conversion from the first type to the second type is guaranteed not to lose information for any value and thus requires no action at run time. The conversion is said to be unconditionally exact.
- For other pairs, a run-time test is needed to check whether the value can be converted from the first type to the second type without loss of information. Examples include long to int and int to float; both of these conversions detect loss of precision by relying on the notion of representation equivalence defined in the specification of the java.lang.Double class.
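The kind of run-time test meant here can be approximated in current Java with round-trip conversions. The following is a sketch under assumptions, not the mechanism the JEP specifies: the class name RuntimeExactness and its methods are hypothetical, and Double.compare(...) == 0 is assumed to be an adequate stand-in for representation equivalence (it treats all NaN values as equal and distinguishes -0.0 from 0.0).

```java
final class RuntimeExactness {
    // long -> int: exact when the truncated value widens back to the original.
    static boolean longFitsInt(long v) {
        return (int) v == v;
    }

    // double -> int: the round trip must reproduce the same double.
    // Under this reading NaN and -0.0 fail, since no int corresponds to them.
    static boolean doubleFitsInt(double d) {
        return Double.compare((double) (int) d, d) == 0;
    }

    // double -> float: exact when widening the float back gives the same double.
    // NaN passes here because NaN is representable as a float.
    static boolean doubleFitsFloat(double d) {
        return Double.compare((double) (float) d, d) == 0;
    }

    public static void main(String[] args) {
        System.out.println(longFitsInt(10_000_000_000L)); // prints false
        System.out.println(doubleFitsInt(30.0));          // prints true
        System.out.println(doubleFitsInt(-0.0));          // prints false
        System.out.println(doubleFitsFloat(0.1));         // prints false
    }
}
```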
A primitive conversion is unconditionally exact if it widens from one integral type to another, widens from one floating-point type to another, widens from byte, short, or char to a floating-point type, or widens int to double.
In more detail, using the notation of JLS §5.5, the following table signifies the unconditionally exact primitive conversions with the symbol ɛ. For completeness, — means no conversion is allowed, ≈ means the identity conversion, ω means a widening primitive conversion, η means a narrowing primitive conversion, and ωη means a widening and narrowing primitive conversion.
From ↓ / To → | byte | short | char | int | long | float | double | boolean |
---|---|---|---|---|---|---|---|---|
byte | ≈ | ɛ | ωη | ɛ | ɛ | ɛ | ɛ | — |
short | η | ≈ | η | ɛ | ɛ | ɛ | ɛ | — |
char | η | η | ≈ | ɛ | ɛ | ɛ | ɛ | — |
int | η | η | η | ≈ | ɛ | ω | ɛ | — |
long | η | η | η | η | ≈ | ω | ω | — |
float | η | η | η | η | η | ≈ | ɛ | — |
double | η | η | η | η | η | η | ≈ | — |
boolean | — | — | — | — | — | — | — | ≈ |
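The cells marked ɛ in the table can be encoded as a small lookup. The sketch below is only an illustration of the table's content; the class name UnconditionalExactness, the enum Prim, and the method isUnconditionallyExact are hypothetical and do not correspond to any JDK API.

```java
import java.util.EnumSet;

final class UnconditionalExactness {
    enum Prim { BYTE, SHORT, CHAR, INT, LONG, FLOAT, DOUBLE, BOOLEAN }

    // Returns true for exactly the (from, to) pairs marked as
    // unconditionally exact in the conversion table.
    static boolean isUnconditionallyExact(Prim from, Prim to) {
        switch (from) {
            case BYTE:
                return EnumSet.of(Prim.SHORT, Prim.INT, Prim.LONG, Prim.FLOAT, Prim.DOUBLE).contains(to);
            case SHORT:
            case CHAR:
                return EnumSet.of(Prim.INT, Prim.LONG, Prim.FLOAT, Prim.DOUBLE).contains(to);
            case INT:
                // int -> float is a widening conversion but may lose precision.
                return EnumSet.of(Prim.LONG, Prim.DOUBLE).contains(to);
            case FLOAT:
                return to == Prim.DOUBLE;
            default:
                // long, double, and boolean have no unconditionally exact target.
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUnconditionallyExact(Prim.INT, Prim.DOUBLE)); // prints true
        System.out.println(isUnconditionallyExact(Prim.INT, Prim.FLOAT));  // prints false
    }
}
```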
In the following examples the unconditionally exact conversions are marked with (ɛ). An instanceof test over such a conversion is always true regardless of the value; all the others require a run-time test.
byte b = 42;
b instanceof int;         // true (ɛ)

int i = 1000;
i instanceof byte;        // false

int i = 42;
i instanceof byte;        // true

int i = 16_777_217;       // 2^24+1
i instanceof float;       // false
i instanceof double;      // true (ɛ)
i instanceof Integer;     // true (ɛ)
i instanceof Number;      // true (ɛ)

float f = 1000.0f;
f instanceof byte;        // false
f instanceof int;         // true
f instanceof double;      // true (ɛ)

double d = 1000.0d;
d instanceof byte;        // false
d instanceof int;         // true
d instanceof float;       // true

Integer ii = 1000;
ii instanceof int;        // true
ii instanceof float;      // true
ii instanceof double;     // true

Integer ii = 16_777_217;
ii instanceof float;      // false
ii instanceof double;     // true
Primitive type patterns
At present, type patterns allow primitive types only when they appear in a nested pattern list of a record pattern; they are not permitted in top-level contexts. We lift that restriction and then define the semantics of primitive type patterns, and of reference type patterns on targets of primitive type, in terms of safe casting conversions:
- A type pattern T t is applicable to a target of type U if a U could be cast to T without an unchecked warning.
- A type pattern T t is unconditional on a target of type U if all values of U can be exactly cast to T. This includes widening from one reference type to another, boxing, and any of the unconditionally exact primitive conversions defined above.
- A set of patterns containing a type pattern T t is exhaustive on a target of type U if T t is unconditional on U or if there is an unboxing conversion from U to T.
- A type pattern T t dominates a type pattern U u, or a record pattern U(...), if T t would be unconditional on a target of type U.
- A type pattern T t that is not null-matching is said to match a target u if u instanceof T. The instanceof check ensures that the implied casting conversion would not result in loss of information or error.
Exhaustiveness
A switch expression requires that all statically-known possible values of the selector expression be handled in the switch block; in other words, the switch must be exhaustive. While a switch can be exhaustive if it contains an unconditional type pattern, it can be exhaustive in other situations as well, deferring any unhandled cases to run time. If a set of patterns is exhaustive for a type then the run-time values that are not matched by any pattern in the set are the remainder of the set. (For further detail, see Patterns: Exhaustiveness, Unconditionality, and Remainder.)
With pattern labels involving record patterns, some patterns are considered to be exhaustive even when they are not unconditional. For example:
Box<Box<String>> bbs = ...
switch (bbs) {
    case Box(Box(String s)): ...
}
This switch is considered exhaustive on Box<Box<String>> even though the pattern Box(Box(String s)) will not match the pathological value new Box(null), which is in the remainder set and is handled by a synthetic default clause that throws MatchException.
With the introduction of primitive type patterns, we observe that unboxing follows the same philosophy. For example:
Box<Integer> bi = ...
switch (bi) {
    case Box(int i): ...
}
This switch is considered exhaustive on Box<Integer> even though the pattern Box(int i) will not match the pathological value new Box(null), which is in the remainder set.
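The treatment of the remainder can be pictured with a hand-written equivalent in current Java. This is a sketch under assumptions rather than the compiler's actual translation: the class name UnboxingRemainder, the component name value, and the method unwrap are hypothetical, and IllegalStateException is used purely as a stand-in for the exception a synthetic default clause would throw.

```java
final class UnboxingRemainder {
    record Box<T>(T value) { }

    // Hand-written approximation of: switch (bi) { case Box(int i): ... }
    static int unwrap(Box<Integer> bi) {
        Integer boxed = bi.value();
        if (boxed == null) {
            // new Box(null) is in the remainder: no pattern matches it.
            // IllegalStateException is a placeholder chosen for this sketch.
            throw new IllegalStateException("no pattern matched: Box(null)");
        }
        return boxed; // unboxing cannot fail once null is excluded
    }

    public static void main(String[] args) {
        System.out.println(unwrap(new Box<>(42))); // prints 42
    }
}
```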
Constant expressions in case labels
The primitive types long, float, double, and boolean (and their corresponding boxed types Long, Float, Double, and Boolean) can now be used in type patterns in the case labels of a switch block as long as the type of the selector expression is either the same type or its corresponding boxed (or unboxed) type.
Constants of the primitive types can be used in case labels as long as they have the same type as the selector expression (or its unboxed type). For example:
switch (f) {
    case 0f -> 5f + 0f;
    case Float fi when fi == 1f -> 6f + fi;
    case Float fi -> 7f + fi;
}
Here the constant expression 0f can be used only when the type of the selector expression, f, is either float or Float.
The semantics of floating-point constants in case labels is defined in terms of representation equivalence at both run time and compile time. It is a compile-time error to use two floating-point constants that are representationally equivalent. For example:
float f = ...
switch (f) {
    case 1.0f -> ...
    case 0.999999999f -> ...  // error: duplicate label
    default -> ...
}
While 1.0f is representable as a float, 0.999999999f is not. The latter is rounded up to 1.0f, thus creating a duplicate case label.
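Whether two float literals denote the same label can be tested in current Java by comparing bit patterns. The sketch below is illustrative; the class name DuplicateFloatLabels and the helper sameLabel are hypothetical, and Float.floatToIntBits is assumed here to be a reasonable approximation of representation equivalence because it collapses all NaN values into one while keeping 0.0f and -0.0f distinct.

```java
final class DuplicateFloatLabels {
    // True when both literals end up as the same float representation.
    static boolean sameLabel(float a, float b) {
        return Float.floatToIntBits(a) == Float.floatToIntBits(b);
    }

    public static void main(String[] args) {
        System.out.println(sameLabel(1.0f, 0.999999999f)); // prints true
        System.out.println(sameLabel(0.0f, -0.0f));        // prints false
        System.out.println(sameLabel(Float.NaN, Float.NaN)); // prints true
    }
}
```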
Since the boolean type (and its corresponding boxed type) has only two distinct values, a switch that lists both the true and false cases is considered exhaustive:
boolean b = ...
switch (b) {
    case true -> ...
    case false -> ...
    // Alternatively: case true, false -> ...
}
It is a compile-time error for this switch to include a default clause.