Summary
Enhance the Java language with unnamed patterns, which match a record
component without stating the component's name or type, and with unnamed
variables, which can be initialized but not used. Both are denoted by the
underscore character, _.
Problem
Java developers use record patterns to disaggregate a record instance into its
components. In the following code, one part of a program creates a ColoredPoint
instance, while another part of the program uses pattern matching with
instanceof to test whether a variable is a ColoredPoint and, if so, extract its
two components:
record Point(int x, int y) {}
enum Color { RED, GREEN, BLUE }
record ColoredPoint(Point p, Color c) {}
... new ColoredPoint(new Point(3,4), Color.GREEN) ...
if (r instanceof ColoredPoint(Point p, Color c)) {
    ... p.x() ... p.y() ...
}
The code above needs only p in the if block, not c. Today, however, developers
have to spell out all the components of a record class every time they perform
pattern matching. Furthermore, it is not visually clear that the Color
component is irrelevant. This is especially evident when record patterns are
nested to extract data within components, such as:
if (r instanceof ColoredPoint(Point(int x, int y), Color c)) {
    ... x ... y ...
}
As a result, omitting unnecessary components such as Color c in both of the
previous examples would be desirable for clearer code.
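With the unnamed pattern proposed in this document, both tests could be written without naming the Color component. The following is a minimal sketch, assuming JDK 22 or later (or JDK 21 with --enable-preview); the class RecordPatternSketch and its two methods are hypothetical names introduced only for illustration.

```java
record Point(int x, int y) {}
enum Color { RED, GREEN, BLUE }
record ColoredPoint(Point p, Color c) {}

// Hypothetical demo class, not part of the proposal.
class RecordPatternSketch {
    // Only the Point component is needed, so the Color component is
    // matched by the unnamed pattern _ (no type, no name).
    static int sumViaPoint(Object r) {
        if (r instanceof ColoredPoint(Point p, _)) {
            return p.x() + p.y();
        }
        return -1; // sentinel chosen for this sketch: r is not a ColoredPoint
    }

    // Nested form: the ignored components keep their types but lose
    // their names (unnamed pattern variables in type patterns).
    static int xOnly(Object r) {
        if (r instanceof ColoredPoint(Point(int x, int _), Color _)) {
            return x;
        }
        return -1;
    }

    public static void main(String[] args) {
        Object r = new ColoredPoint(new Point(3, 4), Color.GREEN);
        System.out.println(sumViaPoint(r)); // prints 7
        System.out.println(xOnly(r));       // prints 3
    }
}
```

The first method drops both the type and the name of the unused component; the second keeps the types, which can help readers see the shape being matched.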
On other occasions, developers may not need to initialize any pattern
variables during pattern matching, but they still need to explore the shape of
the structure at run time. As a highly simplified example, consider the
following Box and Ball classes, and a switch that explores the content of a Box:
record Box<T extends Ball>(T content) {}
sealed abstract class Ball permits RedBall, BlueBall, GreenBall {}
final class RedBall extends Ball {}
final class BlueBall extends Ball {}
final class GreenBall extends Ball {}
Box<? extends Ball> b = ...
switch (b) {
    case Box(RedBall red) -> processBox(b);
    case Box(BlueBall blue) -> processBox(b);
    case Box(GreenBall green) -> stopProcessing();
}
Since the variables are unused, it would be ideal if the developer could elide their names while keeping the explicit types for shape analysis.
Furthermore, if the switch were hypothetically refactored to group the first two
patterns in one case (something that is not allowed in Pattern Matching for Switch):
case Box(RedBall red), Box(BlueBall blue) -> processBox(b);
then it would be erroneous to name the components: neither of the names is usable on the right-hand side, because either of the patterns on the left-hand side could have matched. Since the names are unusable, it would be ideal to elide them.
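Under this proposal the grouped case becomes legal once the components are unnamed. The sketch below assumes JDK 22 or later (or JDK 21 with --enable-preview); BoxSwitchSketch and its handle method are hypothetical, and the returned strings stand in for the processBox and stopProcessing calls of the original example.

```java
record Box<T extends Ball>(T content) {}
sealed abstract class Ball permits RedBall, BlueBall, GreenBall {}
final class RedBall extends Ball {}
final class BlueBall extends Ball {}
final class GreenBall extends Ball {}

// Hypothetical demo class, not part of the proposal.
class BoxSwitchSketch {
    static String handle(Box<? extends Ball> b) {
        return switch (b) {
            // Two patterns share one case; this is allowed because
            // neither pattern introduces a binding variable.
            case Box(RedBall _), Box(BlueBall _) -> "process";
            // The type is kept for shape analysis, the name is elided.
            case Box(GreenBall _) -> "stop";
        };
    }

    public static void main(String[] args) {
        System.out.println(handle(new Box<>(new RedBall())));   // prints process
        System.out.println(handle(new Box<>(new BlueBall())));  // prints process
        System.out.println(handle(new Box<>(new GreenBall()))); // prints stop
    }
}
```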
Turning to traditional imperative code, most developers will have encountered
the situation of having to declare a variable that they did not intend to use.
This typically occurs when the side effect of a statement is more important than
its result. For example, the following code uses an enhanced-for statement to
step through a collection, calculating total as a side effect, without using
the loop variable order:
int total = 0;
for (Order order : orders) {
    if (total < LIMIT) {
        ... total++ ...
    }
}
The prominence of order's declaration is unfortunate given that order is not
used. Here is another example where the side effect of an expression is more
important than its result, leading to an unused variable. The following code
dequeues data but only needs two out of every three elements:
Queue<Integer> q = ... // x1, y1, z1, x2, y2, z2 ..
while (q.size() >= 3) {
    int x = q.remove();
    int y = q.remove();
    int z = q.remove(); // z is unused
    ... new Point(x, y) ...
}
The third call to remove() has the desired side effect, dequeuing an element,
regardless of whether its result is assigned to a variable. The name z could
therefore be elided, while the declaration itself would still show that
remove() does return a value.
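Both loops could be written with unnamed variables under this proposal. The following is a sketch only, assuming JDK 22 or later (or JDK 21 with --enable-preview); the Order record, the LIMIT value, and the class UnusedLocalSketch are hypothetical stand-ins for the elided parts of the examples above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

record Order(String id) {}       // hypothetical stand-in for the Order type
record Point(int x, int y) {}

// Hypothetical demo class, not part of the proposal.
class UnusedLocalSketch {
    static final int LIMIT = 2;  // assumed value, for illustration only

    // Enhanced for: the loop variable is never read, so it is unnamed.
    static int countUpToLimit(List<Order> orders) {
        int total = 0;
        for (Order _ : orders) {
            if (total < LIMIT) {
                total++;
            }
        }
        return total;
    }

    // Local variable declaration: the third element is dequeued and
    // discarded, while the declaration still shows remove() has a result.
    static List<Point> pointsFrom(Queue<Integer> q) {
        List<Point> points = new ArrayList<>();
        while (q.size() >= 3) {
            int x = q.remove();
            int y = q.remove();
            int _ = q.remove();
            points.add(new Point(x, y));
        }
        return points;
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(new Order("a"), new Order("b"), new Order("c"));
        System.out.println(countUpToLimit(orders)); // prints 2
        Queue<Integer> q = new ArrayDeque<>(List.of(1, 2, 3, 4, 5, 6));
        System.out.println(pointsFrom(q)); // prints [Point[x=1, y=2], Point[x=4, y=5]]
    }
}
```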
Unused variables occur frequently in two other kinds of statement that focus on side effects:
- The try-with-resources statement is always used for its side effect: the automatic closing of resources. For example, the following code acquires and (automatically) releases a context; the name acquiredContext is merely clutter:
try (var acquiredContext = ScopedContext.acquire()) {
    ... acquiredContext not used ...
}
- Exceptions are the ultimate side effect, and handling one often gives rise to
an unused variable. For example, most Java developers will have written catch
blocks as shown below, where the name of the exception parameter is irrelevant:
String s = ...;
try {
    int i = Integer.parseInt(s);
    ... i ...
} catch (NumberFormatException ex) {
    System.out.println("Bad number: " + s);
}
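Both statements could use an unnamed variable under this proposal. The sketch below assumes JDK 22 or later (or JDK 21 with --enable-preview). The nested Context class is a hypothetical stand-in for ScopedContext, whose real definition is not given in this document, and parseOrDefault is likewise an illustrative name.

```java
// Hypothetical demo class, not part of the proposal.
class SideEffectSketch {
    // Hypothetical resource that tracks how many contexts are open.
    static final class Context implements AutoCloseable {
        static int openCount = 0;

        static Context acquire() {
            openCount++;
            return new Context();
        }

        @Override
        public void close() {
            openCount--;
        }
    }

    static int parseOrDefault(String s, int fallback) {
        // The resource is acquired and closed automatically; it is never
        // referenced in the body, so it is declared without a name.
        try (var _ = Context.acquire()) {
            return Integer.parseInt(s);
        } catch (NumberFormatException _) {
            // The exception object itself is not needed here.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("42", 0));    // prints 42
        System.out.println(parseOrDefault("oops", -1)); // prints -1
        System.out.println(Context.openCount);          // prints 0
    }
}
```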
Even code without side effects is sometimes forced to declare unused variables.
For example, the following code generates a map where each key is mapped to the
same placeholder value; since the lambda parameter v is not used, its name is
irrelevant:
...stream.collect(Collectors.toMap(String::toUpperCase, v -> "NODATA"));
In all these scenarios where variables are unused and their names are irrelevant, it would be ideal if developers could declare variables with no name.
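For instance, the lambda in the toMap call above could declare its parameter without a name. This is a minimal sketch, assuming JDK 22 or later (or JDK 21 with --enable-preview); PlaceholderMapSketch and placeholders are hypothetical names, and the input stream is invented for the example.

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the proposal.
class PlaceholderMapSketch {
    // Every key maps to the same placeholder, so the value-mapper's
    // parameter is never read and is declared as _.
    static Map<String, String> placeholders(Stream<String> keys) {
        return keys.collect(Collectors.toMap(String::toUpperCase, _ -> "NODATA"));
    }

    public static void main(String[] args) {
        Map<String, String> m = placeholders(Stream.of("a", "b"));
        System.out.println(m.get("A")); // prints NODATA
        System.out.println(m.size());   // prints 2
    }
}
```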
Solution
The Java language is enhanced as follows:
- Allow the underscore _ to denote an unnamed pattern in place of a whole type pattern or record pattern.
- Allow the underscore _ to denote an unnamed pattern variable in a type pattern.
- Allow the underscore _ to denote an unnamed variable when the local variable in a local variable declaration statement, the exception parameter in a catch clause, or the lambda parameter in a lambda expression is unused. The following kinds of declaration can introduce either a named variable (denoted by an identifier) or an unnamed variable (denoted by an underscore):
  - a local variable declaration statement in a block (JLS 14.4.2)
  - a resource specification of a try-with-resources statement (JLS 14.20.3)
  - the header of a basic for statement (JLS 14.14.1)
  - the header of an enhanced for loop (JLS 14.14.2)
  - an exception parameter of a catch block (JLS 14.20)
  - a formal parameter of a lambda expression (JLS 15.27.1)
- Allow unnamed pattern variables in a switch that needs to execute the same action for multiple cases. The grammar of switch labels is enhanced to allow multiple patterns. Those are semantically correct only when unnamed pattern variables are used in all pattern cases and no binding variables are introduced.
- Neither the unnamed pattern nor var _ may be used at the top level of a pattern: both ... instanceof _ and ... instanceof var _ are prohibited, as are case _ and case var _.
- The lint warning that a try-with-resources (TWR) resource is never referenced must be muted when the resource is declared with _, because that warning no longer applies to unnamed variables.
- Update javax.lang.model for unnamed variables. This is tracked in a separate CSR: 8307577: Implementation for javax.lang.model for unnamed variables (Preview).
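Two of the points above, the basic for header and the top-level restriction, can be sketched as follows. This assumes JDK 22 or later (or JDK 21 with --enable-preview); DeclarationKindsSketch, sideEffect, and describe are hypothetical names, and the prohibited forms appear only in comments because they do not compile.

```java
// Hypothetical demo class, not part of the proposal.
class DeclarationKindsSketch {
    static int calls = 0;

    static int sideEffect() {
        return ++calls;
    }

    static int loopThreeTimes() {
        int iterations = 0;
        // Basic for header: i is named, while the result of the
        // initializing call to sideEffect() is deliberately unnamed.
        for (int i = 0, _ = sideEffect(); i < 3; i++) {
            iterations++;
        }
        return iterations;
    }

    static String describe(Object o) {
        // Prohibited at the top level of a pattern:
        //   o instanceof _
        //   o instanceof var _
        // A type pattern with an unnamed pattern variable is still allowed.
        return (o instanceof String _) ? "string" : "other";
    }

    public static void main(String[] args) {
        System.out.println(loopThreeTimes()); // prints 3
        System.out.println(calls);            // prints 1
        System.out.println(describe("x"));    // prints string
        System.out.println(describe(42));     // prints other
    }
}
```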
Specification
The updated JLS draft for unnamed patterns and variables is attached as jep443-20230322.zip. It is also available at https://cr.openjdk.org/~abimpoudis/unnamed/jep443-20230322/specs/unnamed-jls.html
The proposed API enhancements are attached as specdiff.preliminary.00.zip. They mostly reflect the introduction of a new tree kind to support an AnyPatternTree. Changes in javax.lang.model are included in 8307577: Implementation for javax.lang.model for unnamed variables (Preview).
The changes to the specification and API are subject to change until the CSR is finalized.
- csr of: JDK-8302344 Compiler Implementation for Unnamed patterns and variables (Preview) (Resolved)
- relates to: JDK-8307444 java.lang.AssertionError when using unnamed patterns (Resolved)
- relates to: JDK-8315851 Compiler Implementation for Unnamed Variables & Patterns (Closed)