JEP 401
Summary
Enhance the Java Platform with value objects: class instances that have only final fields and lack object identity.
This is a preview language and VM feature.
Goals
Allow developers to opt in to a programming model for domain values in which objects are distinguished solely by the values of their fields, much as the int value 3 is distinguished from the int value 4.
Support compatible migration of existing classes that represent domain values to this programming model. Migrate suitable classes in the JDK, such as Integer and LocalDate, to be value classes.
Maximize the freedom of the JVM to store domain values in ways that improve memory footprint, locality, and garbage collection efficiency.
Non-Goals
It is not a goal to automatically treat existing classes as value classes, even if they share some characteristics of value classes. Value objects do not uniformly work the same way as other objects, so class authors must explicitly choose to declare value classes.
It is not a goal to "fix" the == operator so that programmers can use it in place of equals. This JEP redefines == only as much as necessary to cope with a new kind of identity-free object. The usual advice to compare objects in most contexts using the equals method still applies.
It is not a goal to introduce a struct feature in the Java language. Java programmers are not asked to understand new semantics for memory management or variable storage. Java continues to operate on just two kinds of data: primitives and object references.
It is not a goal to change the treatment of primitive types. Primitive types behave like value classes in many ways, but are a distinct concept. A separate JEP will provide enhancements to make primitive types more class-like and compatible with generics.
It is not a goal to guarantee any particular optimization strategy or memory layout. This JEP enables many potential optimizations; only some will be implemented initially. Some optimizations, such as layouts that exclude null, will only be possible after future language and JVM enhancements.
Motivation
Java developers often need to represent domain values: the date of an event, the color of a pixel, the shipping address of an order, and so on. Developers usually model these values with immutable classes that contain just enough business logic to construct, validate, and transform instances. The toString, equals, and hashCode methods in these classes are defined so that equivalent instances can be used interchangeably.
As an example, event dates can be represented with the JDK's LocalDate class:
jshell> LocalDate d1 = LocalDate.of(1996, 1, 23)
d1 ==> 1996-01-23
jshell> LocalDate d2 = d1.plusYears(30)
d2 ==> 2026-01-23
jshell> LocalDate d3 = d2.minusYears(30)
d3 ==> 1996-01-23
jshell> d1.equals(d3)
$4 ==> true
Developers will regard the "essence" of a LocalDate object as its year, month, and day values. But to Java, the essence of any object is its identity. Each time the of method in LocalDate invokes new LocalDate(...), an object with a unique identity is allocated, distinguishable from every other object in the system.
The easiest way to observe the identity of an object is with the == operator:
jshell> d1 == d3
$6 ==> false
Even though d1 and d3 represent the same year-month-day triple (d1.equals(d3) is true), they are two objects with distinct identities.
For mutable objects, identity is important: it lets us distinguish two objects that have the same state now but will have different state in the future. For example, suppose a class Customer has a field lastOrderedDate that is mutated when the customer makes a new order. Two Customer objects might have the same lastOrderedDate, but it would be a coincidence; when one of the customers makes a new order, the application will mutate the lastOrderedDate of one Customer object but not the other, relying on identity to pick the right one.
In other words, when objects are mutable, they are not interchangeable. But most domain values are not mutable and are interchangeable. There is no practical difference between two LocalDate objects representing 1996-01-23, because their state is fixed and unchanging. They represent the same domain value, both now and in the future. There is no need to distinguish the two objects via their identities.
In fact, object identity is actively confusing when objects are immutable and are meant to be interchangeable. Most developers will recall the experience of unwittingly using == to compare objects, as in d1 == d3 above, and being mystified by a false result even though the objects' state and behavior seem identical.
The JDK tries to reduce confusion for the immutable classes that model primitive values, such as Integer. In particular, the autoboxing of small int values to Integer uses a cache to avoid creating Integer objects with unique identities. However, this cache, somewhat arbitrarily, does not extend to four-digit int values like 1996:
jshell> Integer i = 96, j = 96;
i ==> 96
j ==> 96
jshell> i == j
$3 ==> true
jshell> Integer x = 1996, y = 1996;
x ==> 1996
y ==> 1996
jshell> x == y
$6 ==> false
For domain values like Integer, the fact that each object has unique identity is unwanted complexity that leads to surprising behavior and exposes incidental implementation choices. This extra complexity could be avoided if objects whose state and behavior make them interchangeable could be freed from the legacy requirement to have distinct identities.
Java's requirement that every object have identity, even if some domain values don't want it, is a performance impediment. It means the JVM has to allocate memory for each newly created object, distinguishing it from every object already in the system, and reference the location in memory whenever the object is used or stored.
For example, suppose a program creates arrays of int values and LocalDate references:
jshell> int[] ints = { 1996, 2006, 1996, 1, 23 }
ints ==> int[5] { 1996, 2006, 1996, 1, 23 }
jshell> LocalDate[] dates = { d1, d1, d2, null, d3 }
dates ==> LocalDate[5] { 1996-01-23, 1996-01-23, 2026-01-23,
null, 1996-01-23 }
The int array can be allocated by the JVM as a simple block of memory:
+----------+
| int[5]   |
+----------+
| 1996     |
| 2006     |
| 1996     |
| 1        |
| 23       |
+----------+
In contrast, the LocalDate array must be represented as a sequence of pointers, each referencing a location in memory where an object has been allocated:
+--------------+
| LocalDate[5] |
+--------------+
| 87fa1a09     | -----------------------> +-----------+
| 87fa1a09     | -----------------------> | LocalDate |
| 87fb4ad2     | ------> +-----------+    +-----------+
| 00000000     |         | LocalDate |    | y=1996    |
| 87fb5366     | ---     +-----------+    | m=1       |
+--------------+    |    | y=2026    |    | d=23      |
                    v    | m=1       |    +-----------+
           +-----------+ | d=23      |
           | LocalDate | +-----------+
           +-----------+
           | y=1996    |
           | m=1       |
           | d=23      |
           +-----------+
Even though the data modeled by the LocalDate array is not significantly more complex than the int array (a year-month-day triple is effectively 48 bits of primitive data), the memory footprint is far greater because of the pointers and allocated objects. dates[4] has to point to a different object than dates[0] and dates[1], even though all three elements represent the same year-month-day triple.
Worse, when a program iterates over the LocalDate array, each pointer may need to be dereferenced. CPUs use caches to enable fast access to chunks of memory; if the array exhibits poor memory locality (a distinct possibility if the LocalDate objects were allocated at different times or out of order), every dereference may require caching a different chunk of memory, frustrating performance.
In some application domains, developers program for speed by creating as few objects as possible, thus de-stressing the garbage collector and improving locality. For example, they might encode event dates with an int representing an epoch day. Unfortunately, this approach gives up the functionality of classes that makes Java code so maintainable: meaningful names, private state, data validation by constructors, convenience methods, etc. A developer operating on dates represented as int values might accidentally interpret the value relative to a start date in 1601 or 1980 rather than the intended 1970 start date.
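The hazard of a bare int encoding can be made concrete with a small sketch in ordinary Java (no preview features needed). The class EpochDayPitfall and its method names are hypothetical, introduced only for illustration; LocalDate.toEpochDay, LocalDate.ofEpochDay, and plusDays are existing JDK methods. The same int decodes to two different dates depending on which start date the reader assumes, and nothing in the type system catches the mistake.

```java
import java.time.LocalDate;

// Hypothetical helper, for illustration only.
final class EpochDayPitfall {
    // Encode a date as days since 1970-01-01, the convention LocalDate itself uses.
    static int encode(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    // Intended decoding: interpret the int relative to 1970-01-01.
    static LocalDate decode(int epochDay) {
        return LocalDate.ofEpochDay(epochDay);
    }

    // Mistaken decoding: the same int read relative to a 1980 start date.
    static LocalDate decodeWithWrongBase(int days) {
        return LocalDate.of(1980, 1, 1).plusDays(days);
    }

    public static void main(String[] args) {
        int raw = encode(LocalDate.of(1996, 1, 23));
        System.out.println(raw);                       // a plain int, with no hint of its meaning
        System.out.println(decode(raw));               // the original date
        System.out.println(decodeWithWrongBase(raw));  // a different date, silently
    }
}
```

A class such as LocalDate rules out this mistake by construction, which is the maintainability benefit the int encoding gives up.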
Trillions of Java objects are created every day, each one bearing a unique identity. We believe the time has come to let Java developers choose which objects in the program need identity, and which do not. An immutable class like LocalDate that represents domain values could opt out of identity, so that it would be impossible to distinguish between two LocalDate objects representing the date 1996-01-23, just as it is impossible to distinguish between two int values representing the number 4.
By opting out of identity, developers are opting in to a programming model that provides the best of both worlds: the abstraction of classes with the simplicity and performance benefits of primitives.
In the future, this programming model will support new Java Platform APIs, such as classes that encode different kinds of integers and floating-point values, and new Java language features, such as user-defined conversions and mathematical operators for domain values.
Description
Java NN introduces value objects to model immutable domain values. A value object is an instance of a value class, declared with the value modifier. Classes without the value modifier are called identity classes, and their instances are identity objects.
Java programs manipulate objects through references. A reference to an object is stored in a variable and lets us find the object's fields. Traditionally, a reference also encodes the unique identity of an object: each execution of new allocates a fresh object and returns a unique reference, which can then be stored in multiple variables (aliasing). And, traditionally, the == operator compares objects by comparing references, so references to two objects are not == even if the objects have identical field values.
In Java NN, value objects are different. A reference to a value object is stored in a variable and lets us find the object's fields, but it does not serve as the unique identity of the object. For a value class, executing new might not allocate a fresh object and might instead return a reference to an existing object, or even a "reference" that embodies the object directly. The == operator compares value objects by comparing their field values, so references to two objects are == if the objects have identical field values.
Developers can save memory and improve performance by using value objects for immutable data. Because programs cannot tell the difference between two value objects with identical field values (not even with ==), the Java Virtual Machine is able to change how a value object is laid out in memory without affecting the program; for example, its fields could be stored on the stack rather than the heap.
The following sections explore how value objects differ from identity objects and illustrate how to declare value classes. This is followed by an in-depth treatment of the special behaviors of value objects, considerations for value class declarations, and the JVM's handling of value classes and objects.
Enabling preview features
Value classes and objects are a preview language feature, disabled by default. To try the examples below in JDK NN you must enable preview features:
- Compile the program with javac --release NN --enable-preview Main.java and run it with java --enable-preview Main; or,
- When using the source code launcher, run the program with java --enable-preview Main.java; or,
- When using jshell, start it with jshell --enable-preview.
Some classes in the Java Platform API become value classes only if preview features are enabled; otherwise, they behave just as they did in JDK NN-1.
Programming with value objects
30 classes in java.* are declared as value classes. They include:
- In java.lang: Integer, Long, Float, Double, Byte, Short, Character, Boolean
- In java.util: Optional, OptionalInt, OptionalLong, OptionalDouble
- In java.time: LocalDate, LocalTime, LocalDateTime, ZonedDateTime, Duration
All instances of these classes are value objects. This includes the boxed primitives that are instances of Integer, Long, etc. The == operator compares value objects by their field values, so, e.g., Integer objects are == if they box the same primitive values:
% -> jshell --enable-preview
| Welcome to JShell -- Version 25-internal
| For an introduction type: /help intro
jshell> Integer x = 1996, y = 1996;
x ==> 1996
y ==> 1996
jshell> x == y
$3 ==> true
Similarly, two LocalDate objects are == if they have the same year, month, and day values:
jshell> LocalDate d1 = LocalDate.of(1996, 1, 23)
d1 ==> 1996-01-23
jshell> LocalDate d2 = d1.plusYears(30)
d2 ==> 2026-01-23
jshell> LocalDate d3 = d2.minusYears(30)
d3 ==> 1996-01-23
jshell> d1 == d3
$7 ==> true
The String class has not been made a value class. Instances of String are always identity objects. We can use the Objects.hasIdentity method, new in JDK NN, to observe whether an object is an identity object.
jshell> String s = "abcd"
s ==> "abcd"
jshell> Objects.hasIdentity(s)
$9 ==> true
jshell> Objects.hasIdentity(d1)
$10 ==> false
jshell> String t = "aabcd".substring(1)
t ==> "abcd"
jshell> s == t
$13 ==> false
In most respects, value objects work the way that objects have always worked in Java. However, a few identity-sensitive operations, such as synchronization, are not supported by value objects.
jshell> synchronized (d1) { d1.notify(); }
| Error:
| unexpected type
| required: a type with identity
| found: java.time.LocalDate
| synchronized (d1) { d1.notify(); }
| ^--------------------------------^
jshell> Object o = d1
o ==> 1996-01-23
jshell> synchronized (o) { o.notify(); }
| Exception java.lang.IdentityException: Cannot synchronize on
an instance of value class java.time.LocalDate
| at (#19:1)
The JVM has a lot of freedom to encode references to value objects at run time in ways that optimize memory footprint, locality, and garbage collection efficiency. For example, we saw the following array earlier, implemented with pointers to heap objects:
jshell> LocalDate[] dates = { d1, d1, d2, null, d3 }
dates ==> LocalDate[5] { 1996-01-23, 1996-01-23, 2026-01-23,
null, 1996-01-23 }
Now that LocalDate objects lack identity, the JVM could implement the array using "references" that encode the fields of each LocalDate directly. Each array element can be represented as a 64-bit word that indicates whether the reference is null, and if not, directly stores the year, month, and day field values of the value object:
+--------------+
| LocalDate[5] |
+--------------+
| 1|1996|01|23 |
| 1|1996|01|23 |
| 1|2026|01|23 |
| 0|0000|00|00 |
| 1|1996|01|23 |
+--------------+
The performance characteristics of this LocalDate array may be similar to those of an ordinary int array:
+----------+
| int[5]   |
+----------+
| 1996     |
| 2006     |
| 1996     |
| 1        |
| 23       |
+----------+
This optimization is just one example; some value classes, like LocalDateTime, are too large to take advantage of this particular technique. Still, the lack of identity enables the JVM to optimize references to value objects in many ways.
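To give a feel for the 64-bit encoding pictured above, here is a minimal sketch in ordinary Java that packs a nullable LocalDate into a long and unpacks it again. The class PackedDate and the bit layout it uses (one non-null flag bit, 32 bits of year, 8 bits each for month and day) are assumptions chosen for illustration; this JEP does not specify any layout, and this is not a description of what a JVM actually does.

```java
import java.time.LocalDate;

// Hypothetical illustration of a flattened encoding; not an actual JVM layout.
final class PackedDate {
    private static final long NON_NULL_FLAG = 1L << 48;

    // Layout (assumed): bit 48 = non-null flag, bits 16..47 = year, bits 8..15 = month, bits 0..7 = day.
    static long pack(LocalDate date) {
        if (date == null) {
            return 0L;                                   // flag bit clear means "null"
        }
        return NON_NULL_FLAG
                | ((date.getYear() & 0xFFFF_FFFFL) << 16)
                | ((long) date.getMonthValue() << 8)
                | date.getDayOfMonth();
    }

    static LocalDate unpack(long word) {
        if ((word & NON_NULL_FLAG) == 0) {
            return null;
        }
        int year = (int) (word >>> 16);                  // low 32 bits after the shift hold the year
        int month = (int) ((word >>> 8) & 0xFF);
        int day = (int) (word & 0xFF);
        return LocalDate.of(year, month, day);
    }

    public static void main(String[] args) {
        long word = pack(LocalDate.of(1996, 1, 23));
        System.out.println(Long.toHexString(word));
        System.out.println(unpack(word));
        System.out.println(unpack(pack(null)));
    }
}
```

In this sketch, two dates with the same year, month, and day always pack to the same word, so comparing words is the same as comparing field values. That correspondence is only available once the objects have no identity to preserve.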
Declaring value classes
Developers can declare their own value classes by applying the value modifier to any class whose instances should be immutable and interchangeable:
- Immutable: All instance fields of the class should be final, and the domain value represented by an instance will not change over time; and
- Interchangeable: It's not necessary to distinguish between two separately-created instances that represent the same domain value.
When the value modifier is applied to a class, its fields are implicitly final. The class is also implicitly final, so cannot be extended. Because the class is final, its methods cannot be overridden.
There is no restriction on the types of fields in a value class. The fields may store references to other value objects, or to identity objects, e.g., strings.
Record classes are final and all their fields are final, so they are often good candidates to be value classes.
jshell> value record Point(int x, int y) {}
| created record Point
jshell> Point p = new Point(17, 3)
p ==> Point[x=17, y=3]
jshell> Objects.hasIdentity(p)
$7 ==> false
jshell> new Point(17, 3) == p
$8 ==> true
Many classes represent immutable and interchangeable domain values but cannot be record classes because they are not transparent. A record is transparent because the fields it uses to represent a domain value are the same as the constructor arguments used to create the domain value. Most classes, however, use private fields to represent a domain value internally in a more efficient way than is exposed externally through public methods. For example, a class might represent a quantity of euros and cents with a single int field to save memory; it cannot be a record class, but it can still be a value class.
value class EURCurrency {
    private int cs; // implicitly final
    private EURCurrency(int cs) { this.cs = cs; }
    public EURCurrency(int euros, int cents) {
        this(euros * 100 + (euros < 0 ? -cents : cents));
    }
    public int euros() { return cs/100; }
    public int cents() { return Math.abs(cs%100); }
    public String toString() {
        // %02d zero-pads the cents, so 1 euro and 5 cents prints as "€1,05"
        return "€%d,%02d".formatted(euros(), cents());
    }
}
Comparing value objects
The purpose of the == operator in Java is to test whether two referenced objects are indistinguishable. If two references are ==, the JVM can freely replace one object with the other, and no code will be able to tell the difference.
For identity objects, the == operator works the same in JDK NN as in 1.0: it checks whether two references are to the same object, at the same location in memory.
For value objects, the == operator checks for statewise equivalence. This means the two references are to objects with the same field values. Two value objects are statewise equivalent if:
- They are instances of the same value class;
- Their primitive-typed fields store the same bit patterns; and
- Their reference-typed fields are ==: either two null references, or two references to the same identity object, or two references to statewise-equivalent value objects.
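The three rules above can be modeled in ordinary Java with a small reflective sketch. It is an illustration of the definition only, not how the JVM implements ==. As a loudly stated assumption, plain records stand in for value classes here, and the names Statewise, GridPoint, and Tagged are hypothetical. Record accessors are assumed to return the component values unchanged.

```java
import java.lang.reflect.Method;
import java.lang.reflect.RecordComponent;

// Stand-ins for value classes: ordinary records (an assumption of this sketch).
record GridPoint(int x, int y) {}
record Tagged(String tag, GridPoint point) {}

final class Statewise {
    // Models the statewise-equivalence rules, treating records as "value objects".
    static boolean equivalent(Object a, Object b) {
        if (a == b) return true;                        // same identity object, or both null
        if (a == null || b == null) return false;
        Class<?> type = a.getClass();
        if (type != b.getClass() || !type.isRecord()) {
            return false;                               // different classes, or distinct identity objects
        }
        try {
            for (RecordComponent rc : type.getRecordComponents()) {
                Method accessor = rc.getAccessor();
                accessor.setAccessible(true);
                Object x = accessor.invoke(a);
                Object y = accessor.invoke(b);
                boolean same = rc.getType().isPrimitive()
                        ? sameBits(x, y)                // primitive field: compare bit patterns
                        : equivalent(x, y);             // reference field: apply the rules recursively
                if (!same) return false;
            }
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Boxed primitives: floating-point values are compared by raw bits, the rest by value.
    private static boolean sameBits(Object x, Object y) {
        if (x instanceof Float f && y instanceof Float g) {
            return Float.floatToRawIntBits(f) == Float.floatToRawIntBits(g);
        }
        if (x instanceof Double d && y instanceof Double e) {
            return Double.doubleToRawLongBits(d) == Double.doubleToRawLongBits(e);
        }
        return x.equals(y);
    }
}
```

For example, Statewise.equivalent(new GridPoint(17, 3), new GridPoint(17, 3)) holds, while two Tagged instances whose tag fields refer to distinct but equal String objects are not equivalent, because String is an identity class and its references are compared by identity.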
== and equals will often produce the same results for value objects. However, for some value classes, instances may be interchangeable (so equals) even if their field values are different (so not ==). Developers who want to test whether two value objects represent the same domain value should use the equals method, and class authors should define equals in a way that always returns true for interchangeable domain values.
An example where == and equals may differ for value objects involves the LazySubstring value class below. It represents a substring of a string lazily, without allocating a new char[] in memory. The internal state of a LazySubstring instance is a source string and two coordinates, while the domain value represented by the instance is a character sequence produced by toString. Accordingly, two instances may model the same character sequence (so are equals) even though their internal state is different (so not ==).
value class LazySubstring {
    private String str;
    private int start, end;
    public LazySubstring(String s, int i, int j) {
        str = s; start = i; end = j;
    }
    public String toString() {
        return str.substring(start, end);
    }
    public boolean equals(Object o) {
        return o instanceof LazySubstring &&
            toString().equals(o.toString());
    }
    public int hashCode() {
        return Objects.hash(LazySubstring.class, toString());
    }
}
jshell> LazySubstring sub1 = new LazySubstring("ringing", 1, 4);
sub1 ==> ing
jshell> LazySubstring sub2 = new LazySubstring("ringing", 4, 7);
sub2 ==> ing
jshell> sub1.equals(sub2)
$3 ==> true
jshell> sub1 == sub2
$4 ==> false
The results of == and equals may also be different if two value objects' fields refer to two identity objects that are interchangeable according to equals, but that have different identities.
jshell> String r = "bringing".substring(1);
r ==> ringing
jshell> r == "ringing"
$6 ==> false
jshell> LazySubstring sub3 = new LazySubstring(r, 1, 4);
sub3 ==> ing
jshell> sub1.equals(sub3)
$8 ==> true
jshell> sub1 == sub3 // tests sub1.str == sub3.str
$9 ==> false
Another situation where == and equals may differ is where value objects have float or double fields. The primitive floating-point types support multiple encodings of NaN using different bit patterns. These NaN values are treated as interchangeable by most floating-point operations, but because each bit pattern is distinct, value objects that wrap different encodings of NaN are not statewise equivalent according to ==. The value class author must decide whether the distinction is meaningful for the equals method. For example, the default behavior of equals in a value record class does not consider NaN encodings to be a meaningful distinction.
jshell> value record Length(float val) {}
| created record Length
jshell> Length l1 = new Length(Float.intBitsToFloat(0x7ff80000))
l1 ==> Length[val=NaN]
jshell> Length l2 = new Length(Float.intBitsToFloat(0x7ff80001))
l2 ==> Length[val=NaN]
jshell> l1.equals(l2)
$13 ==> true
jshell> l1 == l2
$14 ==> false
jshell> Float.floatToRawIntBits(l1.val())
$15 ==> 2146959360
jshell> Float.floatToRawIntBits(l2.val())
$16 ==> 2146959361
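The different notions of NaN equality that are in play can be observed today without preview features. The following sketch uses an ordinary (identity) record named Reading and a helper class NanEncodings, both hypothetical names; the Float methods are existing JDK API. Whether the raw bit patterns survive unchanged is platform-dependent in principle, so the sketch prints that result rather than asserting it.

```java
// Ordinary record used as a stand-in; the name is hypothetical.
record Reading(float val) {}

final class NanEncodings {
    // The same two quiet-NaN bit patterns used in the Length example above.
    static final float NAN_A = Float.intBitsToFloat(0x7ff80000);
    static final float NAN_B = Float.intBitsToFloat(0x7ff80001);

    // Primitive ==: NaN never compares equal to anything, including another NaN.
    static boolean primitiveEqual() {
        return NAN_A == NAN_B;
    }

    // Default record equals: float components are compared so that all NaNs are equal.
    static boolean recordEqual() {
        return new Reading(NAN_A).equals(new Reading(NAN_B));
    }

    // floatToIntBits collapses every NaN to one canonical bit pattern.
    static boolean canonicalBitsEqual() {
        return Float.floatToIntBits(NAN_A) == Float.floatToIntBits(NAN_B);
    }

    // floatToRawIntBits keeps the encoding; this is the comparison that statewise equivalence uses.
    static boolean rawBitsEqual() {
        return Float.floatToRawIntBits(NAN_A) == Float.floatToRawIntBits(NAN_B);
    }

    public static void main(String[] args) {
        System.out.println("primitive ==      : " + primitiveEqual());
        System.out.println("record equals     : " + recordEqual());
        System.out.println("canonical bits == : " + canonicalBitsEqual());
        System.out.println("raw bits ==       : " + rawBitsEqual());
    }
}
```

The design point is that equals expresses the class author's notion of "same domain value", while == on value objects reports only whether the stored bits are the same.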
Note that == performs a "deep" comparison of nested references to other value objects. The number of comparisons is unbounded. In the following example, two deep nests of Box objects require a full traversal to determine whether the objects are statewise equivalent.
jshell> value record Box(Object val) {}
| created record Box
jshell> var b1 = new Box(new Box(new Box(new Box(sub1))))
b1 ==> Box[val=Box[val=Box[val=Box[val=ing]]]]
jshell> var b2 = new Box(new Box(new Box(new Box(sub2))))
b2 ==> Box[val=Box[val=Box[val=Box[val=ing]]]]
jshell> b1.equals(b2)
$20 ==> true
jshell> b1 == b2
$21 ==> false
Constructors of value classes are constrained (discussed later) so that the recursive application of == to value objects will never cause an infinite loop.
Value classes and subclassing
Every value class belongs to a class hierarchy with java.lang.Object at its root, just like every identity class. There is no java.lang.Value superclass of all value classes.
All value classes are subclasses of java.lang.Object and can implement interfaces. This means variables declared with Object, or with interfaces, can store references to both value objects and identity objects.
jshell> Object o = LocalDate.of(1996, 1, 23)
o ==> 1996-01-23
jshell> Objects.hasIdentity(o)
$2 ==> false
jshell> Comparable<?> comp = 123
comp ==> 123
jshell> Objects.hasIdentity(comp)
$2 ==> false
jshell> comp = "abc"
comp ==> "abc"
jshell> Objects.hasIdentity(comp)
$4 ==> true
By default, a value class is implicitly final and cannot be extended. However, a value class may be declared abstract, allowing it to be extended by other classes and have its methods overridden. Methods in an abstract value class may be marked abstract, as in an abstract identity class.
The subclasses of an abstract value class may be value classes or identity classes. Thus, a value class can extend either java.lang.Object or an abstract value class.
The fields of an abstract value class are implicitly final, as in a concrete value class.
Many existing abstract classes are good candidates to be abstract value classes. Applying the value modifier to an abstract class indicates that the class has no need for identity but does not restrict subclasses from having identity. For example, the abstract class Number has no fields, nor any code that depends on identity-sensitive features, so it can be safely migrated to an abstract value class.
abstract value class Number implements Serializable {
    public abstract int intValue();
    public abstract long longValue();
    public byte byteValue() { return (byte) intValue(); }
    ...
}
Integer (a value class) and java.math.BigInteger (an identity class) both extend Number.
jshell> Number num = 123
num ==> 123
jshell> Objects.hasIdentity(num)
$6 ==> false
jshell> num = BigInteger.valueOf(123)
num ==> 123
jshell> Objects.hasIdentity(num)
$8 ==> true
An abstract value class can be sealed to limit who can extend the class.
sealed abstract value class UserID
        permits EmailID, PhoneID, UsernameID {
    ...
}
value record EmailID(String name, String domain) { ... }
value record PhoneID(String digits) { ... }
value record UsernameID(String name) { ... }
Safe construction for value classes
Constructors initialize newly-created objects by setting the values of their fields. Because value objects do not have identity, their initialization requires special care.
An object being constructed is "larval", meaning it has been created but is not yet fully-formed. Larval objects must be handled carefully: if a larval object is shared with code outside the constructor, then domain-specific properties of the object may not yet hold, and the code may even observe the mutation of final fields.
Traditionally, a constructor begins the initialization process by invoking a superclass constructor, super(...). If this is not done explicitly, then the Java compiler inserts a super() call at the beginning of the constructor body. After the superclass returns, the subclass proceeds to set its declared instance fields and perform other initialization tasks. This pattern exposes a completely uninitialized subclass to any larval object leakage that occurs in a superclass constructor.
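The traditional hazard can be reproduced with ordinary identity classes on any recent JDK. In the sketch below, all class names (Account, NamedAccount, LarvalLeakDemo) are hypothetical. The superclass constructor makes a virtual call, which lands in the subclass before the subclass's final field has been assigned, so the field is seen first as null and later with its real value.

```java
import java.util.ArrayList;
import java.util.List;

class Account {
    Account() {
        audit();                                   // virtual call leaks the larval object
    }

    void audit() {}
}

class NamedAccount extends Account {
    static final List<String> observed = new ArrayList<>();
    final String owner;

    NamedAccount(String owner) {
        super();                                   // Account() runs before 'owner' is assigned
        this.owner = owner;
        audit();                                   // second observation, after assignment
    }

    @Override
    void audit() {
        observed.add(String.valueOf(owner));       // reads the final field
    }
}

final class LarvalLeakDemo {
    static List<String> run() {
        NamedAccount.observed.clear();
        new NamedAccount("Ada");
        return List.copyOf(NamedAccount.observed);
    }

    public static void main(String[] args) {
        System.out.println(run());                 // prints [null, Ada]
    }
}
```

The final field owner is observed with two different values, which is exactly the kind of mutation that early construction is designed to rule out.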
Flexible Constructor Bodies in Java 25 enables safer initialization whereby fields can be set and other code executed before the super(...) invocation. There is a two-phase initialization process: early construction before the super(...) invocation, and late construction afterwards.
During the early construction phase, larval object leakage is impossible: the constructor may set the fields of the larval object, but may not invoke instance methods or otherwise make use of this. Fields that are initialized in the early construction phase are therefore set before they can ever be read, even if a superclass leaks the larval object. Final fields, in particular, can never be observed to mutate.
In a value class, by default, all constructor code occurs in the early construction phase. The Java compiler inserts a super() call at the end of the constructor body, not the beginning. Attempts to invoke instance methods or otherwise use this will fail:
value class Name {
    String name;
    int length;
    Name(String n) {
        name = n;
        length = strLength(); // Error, invokes this.strLength()
    }
    private int strLength() {
        return name.length();
    }
}
Instance fields that are declared with initializer expressions are set at the start of the constructor, in the early construction phase.
Instance initializer blocks (a rarely-used feature) are run in the late construction phase, so they cannot set instance fields in value classes.
When a constructor has code that needs to work with this, an explicit super(...) or this(...) call can be used to mark the transition from the early to the late construction phase. All fields must be initialized before the call, and without referring to this:
value class Name {
    String name;
    int length;
    Name(String n) {
        name = n;
        length = strLength(name); // OK, strLength is now static
        super(); // All fields must be set at this point
        System.out.println("Name: " + this);
    }
    private static int strLength(String n) {
        return n.length();
    }
}
In Java 25, the fields in an identity class may only be set in the early construction phase, not read. For convenience, in Java NN, the fields in an identity class or a value class may be read in the early construction phase after they have been set. As a result, both references to name in the constructor above are legal. It continues to be illegal in Java NN to refer to inherited fields, invoke instance methods, or share this with other code until the late construction phase.
Safe construction for identity classes
In identity classes, we believe developers should write constructors and field initializers that avoid the risk of larval object leakage by adopting early construction constraints: read and write the declared fields of the class, but otherwise avoid any dependency on this, and where a dependency is necessary, mark it as deliberate by putting it after an explicit super(...) or this(...) call.
To encourage this style, javac in JDK NN generates lint warnings that indicate this dependencies in constructors of identity classes. In the future, we anticipate that identity classes will have a way to adopt the constructor timing of value classes. A class that compiles without the lint warnings will likely be able to make the transition cleanly.
Further, in Java NN, identity record classes behave the same as value record classes: their constructors always run in the early construction phase. This change is not source compatible, but based on a survey of existing record class declarations, it is not expected to be disruptive.
As an example, the following record class will fail to compile because its canonical constructor refers to this in the early construction phase:
record Node(String label, List<Node> edges) {
    public Node {
        nullCheck(label, this); // OK in Java 25, error in Java NN
        nullCheck(edges, this); // OK in Java 25, error in Java NN
    }
    static void nullCheck(Object arg, Object owner) {
        if (arg == null) {
            String msg = "null arg for " + owner.toString();
            throw new IllegalArgumentException(msg);
        }
    }
}
In cases where a record constructor needs to access this, an explicit super() can be inserted, but the record's fields must be set explicitly beforehand.
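Often the dependency on this can simply be removed. As one possible rewrite of the Node example (a sketch, not the only option), the validation helper below receives a plain description of the argument instead of the larval record. The record is renamed GraphNode here to keep the sketch separate from the original, and the description strings are made up for illustration. The code compiles on current JDKs and contains no reference to this in the constructor.

```java
import java.util.List;

record GraphNode(String label, List<GraphNode> edges) {
    GraphNode {
        // Pass a description of the argument, so nothing depends on the larval object.
        nullCheck(label, "GraphNode.label");
        nullCheck(edges, "GraphNode.edges");
    }

    static void nullCheck(Object arg, String what) {
        if (arg == null) {
            throw new IllegalArgumentException("null arg for " + what);
        }
    }
}
```

This also avoids calling toString on a record whose components have not been assigned yet, which would have printed nulls in the original error message.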
Inherited methods of java.lang.Object
Like any class, a value class inherits methods like equals, hashCode, and toString from java.lang.Object, unless the class author chooses to override them. These methods traditionally depend on identity, but when operating on a value object, they use the values of the object's fields instead. Specifically:
- The inherited implementation of Object.equals uses == to compare objects. For value objects, this tests for statewise equivalence. This might be the right equals behavior for a value class, but if it isn't then the class author should override equals.
- The inherited implementation of Object.hashCode computes a hash from the object's field values. (This value can also be computed via System.identityHashCode.) As usual, the hashCode method should be overridden by a value class whenever it overrides equals.
- The inherited implementation of Object.toString returns a string of the form "ClassName@hashCode". Since value classes represent immutable domain values, most value class authors will want to override toString to more legibly convey the domain value represented by the object.
In a value record, as for all records, the default equals, hashCode, and toString behavior is to recursively apply the same operations to the record components.
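For comparison, the sketch below shows how the same three methods behave today in ordinary identity classes, and what field-based overrides look like. The class names PlainMoney and Money are hypothetical. PlainMoney inherits the identity-based defaults from Object; Money defines equals, hashCode, and toString purely in terms of its field, so its behavior does not depend on identity.

```java
// Inherits equals, hashCode, and toString from Object: all three depend on identity today.
final class PlainMoney {
    private final long cents;

    PlainMoney(long cents) { this.cents = cents; }

    long cents() { return cents; }
}

// Overrides all three in terms of field values only.
final class Money {
    private final long cents;

    Money(long cents) { this.cents = cents; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Money m && cents == m.cents;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(cents);
    }

    @Override
    public String toString() {
        return "Money[cents=" + cents + "]";
    }
}

final class InheritedMethodsDemo {
    public static void main(String[] args) {
        System.out.println(new PlainMoney(5).equals(new PlainMoney(5)));  // false: distinct identities
        System.out.println(new Money(5).equals(new Money(5)));            // true: same field value
        System.out.println(new Money(5));                                 // Money[cents=5]
    }
}
```

A class written like Money behaves the same way whether or not it has identity, which makes it a lower-risk candidate for later becoming a value class.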
A few other methods of Object interact with value objects:
- For a Cloneable value class, the Object.clone method produces a value object that is indistinguishable from the original; the usual expectation that x.clone() != x is not meaningful for value objects. Value classes that store references to identity objects may wish to override clone and perform a "deep copy" of these identity objects.
- The wait and notify methods require that the object be locked in the current thread; since it is impossible to synchronize on a value object, attempts to call these methods will always fail with an IllegalMonitorStateException.
- The finalize method of a value object will never be invoked by the garbage collector.
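The monitor rule behind the wait and notify behavior already exists for identity objects and can be seen on any JDK: calling notify without holding the object's lock throws IllegalMonitorStateException. The sketch below (the class MonitorDemo is a hypothetical name) shows both outcomes for an ordinary identity object. For a value object only the first outcome would ever be reachable, since the lock can never be acquired.

```java
final class MonitorDemo {
    // Returns true if notify() was rejected because the caller does not hold the lock.
    static boolean notifyWithoutLockFails(Object o) {
        try {
            o.notify();
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    // Returns true if notify() succeeds while the lock is held.
    static boolean notifyWithLockSucceeds(Object o) {
        synchronized (o) {
            o.notify();
            return true;
        }
    }

    public static void main(String[] args) {
        Object lock = new Object();
        System.out.println(notifyWithoutLockFails(lock));   // prints true
        System.out.println(notifyWithLockSucceeds(lock));   // prints true
    }
}
```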
Migrating to value classes
Value classes, and especially value records, are useful tools for modeling immutable domain values that are interchangeable when two instances represent the same value.
As a general rule, if a class with immutable state doesn't need identity, it should be made a value class. This includes abstract classes, which often have no state at all and shouldn't impose an identity requirement on their subclasses.
For final and abstract classes with only final fields, applying or removing the value keyword is a binary-compatible change.
However, migrating from an identity class to a value class carries some risks of source and behavioral incompatibility that class authors should consider:
If the class has public constructors, users may have relied on them to create objects that are known to be distinguishable from every other object via ==. Changing the class to be a value class will invalidate that logic, possibly leading to run-time bugs.
If this incompatibility is a serious concern, it may be appropriate to deprecate the public constructors and encourage use of factory methods instead. As an example, in Java 25, the constructors of Integer, Float, etc., are deprecated, and factory methods such as Integer.valueOf are recommended instead.
If users are synchronizing on instances of the class, then after migration their code will fail, either with a compile-time error or an IdentityException at run time. This incompatibility is more likely to be a risk for classes with public constructors, because users will generally want to be sure they "own" the object being used for locking.
If the equals and hashCode methods have not already been overridden, they will behave differently after migration. A good migration candidate will want to override these methods beforehand so that their behavior does not depend on identity.
If the class encapsulates sensitive state, class authors should be cautious about the risk of exposing that state through == or System.identityHashCode: a malicious user could use those operations to try to infer the internal state of an instance. Value classes are not designed to protect sensitive data against such attacks.
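The considerations above suggest what a class can do before migration. The following is a minimal sketch of a hypothetical identity class, Money, prepared along those lines: only final fields, a factory method instead of a public constructor, and equals and hashCode that already ignore identity. It does not use the value modifier, so it compiles on a current JDK.

```java
import java.util.Objects;

final class Money {
    private final String currency;
    private final long cents;

    // Private constructor: callers cannot rely on "new" yielding a distinct identity.
    private Money(String currency, long cents) {
        this.currency = Objects.requireNonNull(currency);
        this.cents = cents;
    }

    // Factory method, in the spirit of Integer.valueOf.
    static Money of(String currency, long cents) {
        return new Money(currency, cents);
    }

    // Equality is defined on field values, so it does not depend on identity.
    @Override
    public boolean equals(Object o) {
        return o instanceof Money m
                && cents == m.cents
                && currency.equals(m.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(currency, cents);
    }

    @Override
    public String toString() {
        return "Money[" + currency + ", " + cents + "]";
    }

    public static void main(String[] args) {
        Money a = Money.of("EUR", 1250);
        Money b = Money.of("EUR", 1250);
        System.out.println(a.equals(b)); // true
        System.out.println(a);           // Money[EUR, 1250]
    }
}
```

With this preparation, equals, hashCode, and toString should give the same results before and after the class becomes a value class; only code that uses == or synchronization on instances would observe a difference.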
Run-time optimizations for value objects
At run time, the JVM can optimize value objects by encoding them in more compact forms than identity objects. Instead of allocating space in the heap for a value object, the JVM can flatten and scalarize the object.
Heap flattening: When a field of one object, or an element of an array, stores a reference to another object, the JVM can encode the other object's field values into the reference directly. When this happens, the reference is not a pointer to the other object in memory. The other object is said to be flattened.
Scalarization: When a method parameter or local variable stores a reference to an object, the JVM can encode the object's field values into additional local variables. When this happens, again, the reference is not a pointer to an object in memory. The object is said to be scalarized.
When an object is flattened or scalarized, it has no independent presence in the heap. This means it has no impact on garbage collection, and its data is always co-located in memory with the referencing object or call stack.
Heap flattening
As an example, the JVM could flatten an array of Integer references so that each array element holds a reference that encodes the underlying integer value directly, rather than pointing to the memory location of some Integer object. Each reference also flags whether the original Integer reference was null by prepending 0 (null) or 1 (non-null) to the integer value.
+--------------+
| Integer[5]   |
+--------------+
| 1|1996       |
| 1|2006       |
| 1|1996       |
| 0|0          |
| 0|0          |
+--------------+
Each int value takes up 32 bits, and each null flag requires at least one additional bit. Due to hardware constraints, the JVM will probably encode each flattened Integer reference as a 64-bit unit. An Integer array thus has a larger memory footprint than a plain int array, but a significantly smaller total footprint than an array of pointers to objects (the pointer itself is a 32- or 64-bit value, and each referenced object requires at least 64 bits just for its header). Even more significantly, all of the Integer data is stored directly inside the array, and can be processed without any extra memory loads.
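The encoding drawn above can be imitated in plain Java by packing a null flag and a 32-bit value into one long. This is only an illustration of the idea: the class name, the method names, and the choice of bit 32 for the flag are assumptions, and the real JVM layout is an implementation detail that the programmer never sees.

```java
class FlatIntegerSketch {
    // Assumed layout: bit 32 is the non-null flag, bits 0..31 hold the int value.
    private static final long NON_NULL_FLAG = 1L << 32;

    // Pack a possibly-null Integer into a single 64-bit unit.
    static long encode(Integer value) {
        if (value == null) {
            return 0L; // flag 0, payload 0
        }
        return NON_NULL_FLAG | (value & 0xFFFF_FFFFL);
    }

    // Recover the Integer, or null, from its packed form.
    static Integer decode(long bits) {
        if ((bits & NON_NULL_FLAG) == 0) {
            return null;
        }
        return (int) bits; // the low 32 bits are the int value
    }

    public static void main(String[] args) {
        Integer[] source = { 1996, 2006, 1996, null, null };

        // One 64-bit slot per element, with no pointers to separate objects.
        long[] flat = new long[source.length];
        for (int i = 0; i < source.length; i++) {
            flat[i] = encode(source[i]);
        }
        for (long bits : flat) {
            System.out.println(decode(bits)); // 1996, 2006, 1996, null, null
        }
    }
}
```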
As shown earlier, an array of LocalDate references can be flattened by prepending a null flag to the year-month-day triple of a LocalDate object (an int and two bytes). Like flattened Integer references, these flattened LocalDate references can fit in 64 bits.
+--------------+
| LocalDate[5] |
+--------------+
| 1|1996|01|23 |
| 1|1996|01|23 |
| 1|2026|01|23 |
| 0|0000|00|00 |
| 1|1996|01|23 |
+--------------+
Fields may also store flattened references. For example, a LocalDateTime object has two fields (a LocalDate and a LocalTime) and both can store a flattened reference.
+----------------------+
| LocalDateTime        |
+----------------------+
| date=1|2026|01|23    |
| time=1|09|00|00|0000 |
+----------------------+
Heap flattening must maintain the integrity of data. A flattened
reference must always be read and written atomically, or it could become
corrupted. On common platforms, this limits the size of most flattened
references to no more than 64 bits.
For example, a flattened reference to a LocalDateTime object would embed fields from the underlying LocalDate and LocalTime, plus a null flag for each, plus a null flag for the LocalDateTime itself. The flattened reference is likely too big to read and write atomically, so it cannot be stored in a field of type LocalDateTime, e.g., the timestamp of an Event:
+------------------------------------------+
| Event                                    |
+------------------------------------------+
| timestamp=1|1|2026|01|23|1|09|00|00|0000 | // Not possible
| ...                                      |
+------------------------------------------+
Instead, the JVM stores a pointer to a LocalDateTime object, whose own fields may store flattened references as shown earlier:
+--------------------+
| Event              |
+--------------------+
| timestamp=87fa50a0 |------> +----------------------+
| ...                |        | LocalDateTime        |
+--------------------+        +----------------------+
                              | date=1|2026|01|23    |
                              | time=1|09|00|00|0000 |
                              +----------------------+
In the future, 128-bit flattened references may be possible on platforms that support atomic reads and writes of that size, or in special cases like final fields.
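The size argument in this subsection can be checked with simple arithmetic. The sketch below assumes the field widths suggested by the diagrams: an int and two bytes for LocalDate, and three bytes plus an int of nanoseconds for LocalTime. The LocalTime widths and all names are assumptions made for illustration.

```java
class FlatSizeSketch {
    static final int NULL_FLAG = 1; // one bit per nullable reference

    // Payload widths in bits, under the assumptions stated above.
    static int localDateBits() { return 32 + 8 + 8; }     // year, month, day
    static int localTimeBits() { return 8 + 8 + 8 + 32; } // hour, minute, second, nano

    static int flatLocalDate() { return NULL_FLAG + localDateBits(); }
    static int flatLocalTime() { return NULL_FLAG + localTimeBits(); }

    // A flattened LocalDateTime embeds both flattened fields plus its own null flag.
    static int flatLocalDateTime() {
        return NULL_FLAG + flatLocalDate() + flatLocalTime();
    }

    public static void main(String[] args) {
        System.out.println(flatLocalDate());     // 49: fits in 64 bits
        System.out.println(flatLocalTime());     // 57: fits in 64 bits
        System.out.println(flatLocalDateTime()); // 107: exceeds 64 bits
    }
}
```

Under these assumptions, each field of a LocalDateTime fits in an atomically accessible 64-bit unit, while the combined reference would need a 128-bit unit.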
Scalarization
When the JVM sees a flattened reference in the field of an object in the heap, it needs to re-encode the reference in a form that it can readily work with. For code compiled by the JVM's just-in-time (JIT) compiler, this encoding can be a scalarized reference.
For example, consider the following code which reads a LocalDate from an array and invokes plusYears. A simplified version of plusYears is shown for reference.
LocalDate d = dates[0];
dates[0] = d.plusYears(30);
...
public LocalDate plusYears(long yearsToAdd) {
    int newYear = YEAR.checkValidIntValue(this.year + yearsToAdd);
    return new LocalDate(newYear, this.month, this.day);
}
In pseudo-code, the result of JIT compilation might look like the following, using the notation { ... } to indicate that multiple values are returned from a JIT-compiled method. (This is purely notational; there is no wrapper at run time.)
{ d_null, d_year, d_month, d_day } = $decode(dates[0]);
dates[0] = $encode($plusYears(d_null, d_year, d_month, d_day, 30));
static { boolean, int, byte, byte }
    $plusYears(boolean this_null, int this_year,
               byte this_month, byte this_day,
               long yearsToAdd) {
    if (this_null) throw new NullPointerException();
    int newYear = YEAR.checkValidIntValue(this_year + yearsToAdd);
    return { false, newYear, this_month, this_day };
}
Thanks to the JVM's optimizations, this code never touches a pointer to a heap-allocated LocalDate:
A flattened reference in dates[0] is converted to a scalarized reference by $decode(...)
A new scalarized reference is returned from plusYears
That reference is converted to another flattened reference by $encode(...)
Unlike heap flattening, scalarization is not constrained by the size of the data. Local variables that are pushed and popped on the stack are not at risk of data races. Therefore, it is possible to have a scalarized encoding of a LocalDateTime reference: three values and a null flag for the underlying LocalDate, four values and a null flag for the underlying LocalTime, and a null flag for the LocalDateTime itself.
JVMs have used similar techniques to scalarize identity objects in methods when the JVM is able to prove that an object's identity is never used. Scalarization of value objects is more predictable and far-reaching, even across method boundaries.
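The decode, compute, encode flow from the pseudo-code can be simulated in ordinary Java by treating a long as the flattened reference and separate parameters as the scalarized reference. This is a sketch under stated assumptions rather than what the JIT emits: the bit layout and all names are hypothetical, and Math.toIntExact stands in for YEAR.checkValidIntValue.

```java
class ScalarizedDateSketch {
    // Assumed packed layout: bit 48 = non-null flag, bits 16..47 = year,
    // bits 8..15 = month, bits 0..7 = day.
    static long encode(boolean isNull, int year, int month, int day) {
        if (isNull) {
            return 0L;
        }
        return (1L << 48)
                | ((year & 0xFFFF_FFFFL) << 16)
                | ((month & 0xFFL) << 8)
                | (day & 0xFFL);
    }

    // Decoding helpers: each extracts one scalar from the packed form.
    static boolean isNull(long bits) { return ((bits >>> 48) & 1L) == 0; }
    static int year(long bits)       { return (int) (bits >>> 16); }
    static int month(long bits)      { return (int) ((bits >>> 8) & 0xFF); }
    static int day(long bits)        { return (int) (bits & 0xFF); }

    // Counterpart of $plusYears: it receives only scalars and returns the
    // re-encoded result, so no date object is allocated along the way.
    static long plusYears(boolean thisNull, int thisYear, int thisMonth,
                          int thisDay, long yearsToAdd) {
        if (thisNull) {
            throw new NullPointerException();
        }
        int newYear = Math.toIntExact(thisYear + yearsToAdd);
        return encode(false, newYear, thisMonth, thisDay);
    }

    public static void main(String[] args) {
        long[] dates = { encode(false, 1996, 1, 23) };

        long d = dates[0];
        dates[0] = plusYears(isNull(d), year(d), month(d), day(d), 30);

        System.out.println(year(dates[0]) + "-" + month(dates[0]) + "-" + day(dates[0]));
        // 2026-1-23
    }
}
```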
When flattening and scalarization can occur
Heap flattening and scalarization are optimizations, not language features. Programmers cannot directly control them. Like all optimizations, they occur at the discretion of the JVM. However, there are things programmers can do to make it more likely that the JVM can apply these optimizations.
First, heap flattening and scalarization rely on the JVM's knowledge that a variable only stores a specific value class: the date of a LocalDateTime is always a LocalDate reference. Flattening and scalarization cannot typically be applied to a variable declared with a supertype of a value class, such as Object.
For example, the following two arrays store the same Integer values when they are created, but because the second needs to be able to store arbitrary Object references in the future, it has to encode its elements as pointers to regular objects on the heap.
Integer[] ints = { 1996,2006,1996,null,null }; // flattenable
Object[] objs = { 1996,2006,1996,null,null }; // not flattenable
Future value objects written to the objs array will need to be converted to a regular heap object encoding.
Integer i = -1;
ints[3] = i; // write a flattened reference
objs[3] = i; // write a heap pointer
A field with a generic type T usually has erased type Object, and so will behave at runtime just like an Object-typed field.
record Box<T>(T field) {} // field is not flattenable
var b = new Box<Integer>(i); // field stores a heap pointer
These conversions between encodings do not have any semantic impact: the Integer objects referenced by objs and field are still value objects, and do not have identity. The JVM is simply encoding the same value object in different ways.
The same principles apply to method parameters: a parameter with type LocalDate is reliably scalarizable, while a parameter with type Object or T is not. (However, if the method call can be inlined, the JIT may be able to skip the assignment and heap allocation completely.)
A second factor that influences whether the JVM applies flattening and scalarization is the contents of a class file that uses value classes. When a class is compiled, the names of value classes mentioned by its field and method signatures get recorded in a new LoadableDescriptors class file attribute. This attribute authorizes the JVM to load the named value classes early enough to set up flattened fields and scalarized method parameters.
If a value class is not listed by LoadableDescriptors, then when the referencing class is loaded, the JVM may not know that it is a value class. A field of that type may be laid out like any other field, storing regular object pointers instead of flattened references. A method with a parameter of that type may not be set up to accept scalarized calls, forcing callers to pass regular object pointers.
In practice, this means classes that depend on migrated value classes will perform the best if the updated value class declaration was available at compile time. If the referenced class was still an identity class when the referencing class was compiled, it will get left out of LoadableDescriptors, and the JVM may not be able to flatten the referencing class's fields or scalarize its method signatures.
Value classes and the Java Platform
The Java Platform API supports value classes and value objects in the following ways:
30 classes in java.* are declared as value classes.
In java.lang: Integer, Long, Float, Double, Byte, Short, Character, Boolean, and the abstract classes Number and Record
In java.util: Optional, OptionalInt, OptionalLong, OptionalDouble
In java.time: Duration, Instant, LocalTime, Year, YearMonth, MonthDay, Period, LocalDate, LocalDateTime, OffsetTime, OffsetDateTime, ZonedDateTime
In java.time.chrono: MinguoDate, HijrahDate, JapaneseDate, ThaiBuddhistDate
To minimize compatibility risks, these classes have long discouraged reliance on the identities of instances, and have been documented as value-based. They have also prevented or discouraged instance creation through constructors. Since Java 16, Warnings for Value-Based Classes have discouraged the use of synchronization with these classes.
The vast majority of Platform APIs work seamlessly with value objects. Methods that operate on Object or Object[] parameters accept value objects. Almost anywhere a user needs to provide an implementation of an interface, the implementation may be a value class. Generic APIs such as List<T> and Comparable<T> can be parameterized with value classes as the type arguments.
New methods in java.util.Objects (hasIdentity, requireIdentity) allow developers to distinguish between identity objects and value objects.
A new constant in java.lang.reflect.AccessFlag exposes whether a class is an identity class or a value class.
Whether a class is an identity class or a value class is recorded in its class file. Identity classes have the ACC_IDENTITY flag set; value classes do not. This flag supersedes the legacy ACC_SUPER flag. The JVM Specification always recommended that compilers and tools set the ACC_SUPER flag in class files, so by default, compilers and tools can continue to set the flag in new class files and generate identity classes.
Serialization works with value records out of the box, but serialization of non-record value classes requires developer attention. Namely, value classes that implement Serializable must implement the writeReplace and readResolve methods. This causes a replacement object to be serialized and deserialized instead of the value object. If these methods are not implemented, attempts to serialize or deserialize the value object will fail with an InvalidClassException.
These methods must be implemented because value classes are compiled using strictly-initialized fields, and deserialization does not safely initialize these fields. Value objects may only be created, and their fields initialized, by invoking a constructor. In the future, enhancements to the serialization mechanism are anticipated that will allow a Serializable value class to be serialized and deserialized automatically.
Deep reflection on value objects is not possible. Libraries that modify final fields via Field.setAccessible are incompatible with safe construction and will not be able to modify value class fields, even if --enable-final-field-mutation is used on the command line. Libraries must initialize instances of a value class using the class's constructors.
The garbage collection APIs in java.lang.ref and java.util.WeakHashMap do not allow developers to manually manage value objects in the heap. Attempts to create Reference objects for value objects throw IdentityException at run time. javac produces identity warnings about uses of the API with value classes at compile time.
Since JDK 25, javac has produced identity warnings about value-based classes being used with these APIs.
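The serialization requirement in the list above resembles the familiar serialization proxy pattern, which can be tried on a current JDK. The sketch below is one possible shape, not a prescribed one: Temperature and its nested Proxy record are hypothetical, the value modifier is omitted so that the code runs without preview features, and readResolve is placed on the proxy, as the classic pattern does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationProxyDemo {
    // Serializes an object to a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Temperature copy = (Temperature) roundTrip(new Temperature(21.5));
        System.out.println(copy.celsius()); // 21.5
    }
}

// Hypothetical class standing in for a Serializable value class.
final class Temperature implements Serializable {
    private final double celsius;

    Temperature(double celsius) { this.celsius = celsius; }

    double celsius() { return celsius; }

    // A replacement object is written to the stream instead of this object.
    private Object writeReplace() { return new Proxy(celsius); }

    // The proxy is a record, so it is deserialized through its canonical
    // constructor; readResolve then rebuilds Temperature via its constructor.
    record Proxy(double celsius) implements Serializable {
        private Object readResolve() { return new Temperature(celsius); }
    }
}
```

The design point is that every Temperature in the deserialized graph comes from a constructor call, which is the property the strictly-initialized fields of a value class require.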
Future Work
Null-Restricted Value Class Types (Preview) will build on this JEP, allowing programmers to manage the storage of nulls and enable more dense heap flattening in fields and arrays.
Enhanced Primitive Boxing (Preview) will enhance the language's use of primitive types, taking advantage of the lighter-weight characteristics of boxing to value objects.
JVM class and method specialization (JEP 218, with revisions) will allow generic classes and methods to specialize field, array, and local variable layouts when parameterized by value class types.
Alternatives
As discussed, JVMs have long performed escape analysis to identify objects that never rely on identity throughout their lifespan and can be scalarized. These optimizations are somewhat unpredictable, and do not help with objects that escape the scope of the optimization, including storage in fields and arrays.
Hand-coded optimizations via primitive values are possible to improve performance, but as noted in the "Motivation" section, these techniques require giving up valuable abstractions.
The C language and its relatives support flattened storage for structs and similar class-like abstractions. For example, the C# language has value types. Unlike value objects, instances of these abstractions have identity, meaning they support operations such as field mutation. As a result, the semantics of copying on assignment, invocation, etc., must be carefully specified, leading to a more complex user model and less flexibility for runtime implementations. We prefer an approach that leaves these low-level details to the discretion of JVM implementations.
Risks and Assumptions
The feature makes significant changes to the Java object model. Developers may be surprised by, or encounter bugs due to, changes in the behavior of operations such as == and synchronized. We expect such disruptions to be rare and tractable.
Some changes could potentially affect the performance of identity objects. The if_acmpeq test, for example, typically only costs one instruction cycle, but will now need an additional check to detect value objects. However, the identity class case can be optimized as a fast path, and we believe we have minimized any performance regressions.
There is a security risk that == and hashCode can indirectly expose private field values. Further, two large trees of value objects can take unbounded time to compute ==. Developers need to understand these risks.
Dependencies
Strict Field Initialization in the JVM (Preview) provides the JVM mechanism necessary to require, through verification, that value class instance fields are initialized during the early construction phase.
- relates to: JDK-8277163 Value Objects (Preview) (Closed)