JEP 401
Type: Feature | Status: Open | Scope: SE | Priority: P3 | Resolution: Unresolved | Effort: XL | Duration: XL
Summary
Enhance the Java Platform with value objects, class instances that have
only final fields and lack object identity.
This is a preview language and VM feature.
Goals
Allow developers to opt in to a programming model for domain values in which objects are distinguished solely by the values of their fields, much as the int value 3 is distinguished from the int value 4.
Migrate popular classes that represent domain values in the standard library, such as Integer and LocalDate, to this programming model. Support compatible migration of user-defined classes.
Maximize the freedom of the JVM to store domain values in ways that improve memory footprint, locality, and garbage collection efficiency.
Non-Goals
It is not a goal to automatically treat existing classes as value classes, even if they meet the requirements for how value classes are declared and used. The behavioral changes require an explicit opt-in.
It is not a goal to "fix" the == operator so that programmers can use it in place of equals. This JEP redefines == only as much as necessary to cope with a new kind of identity-free object. The usual advice to compare objects in most contexts using the equals method still applies.
It is not a goal to introduce a struct feature in the Java language. Java programmers are not asked to understand new semantics for memory management or variable storage. Java continues to operate on just two kinds of data: primitives and object references.
It is not a goal to change the treatment of primitive types. Primitive types behave like value classes in many ways, but are a distinct concept. A separate JEP will provide enhancements to make primitive types more class-like and compatible with generics.
It is not a goal to guarantee any particular optimization strategy or memory layout. This JEP enables many potential optimizations; only some will be implemented initially. Some potential optimizations, such as layouts that exclude null, will only be possible after future language and JVM enhancements.
Motivation
Java developers often need to represent domain values: the date of an event, the
color of a pixel, the shipping address of an order, and so on. Developers
usually model them with immutable classes that contain just enough business
logic to construct, validate, and transform values. Notably, the toString,
equals, and hashCode methods are defined so that equivalent instances can be
used interchangeably.
As an example, event dates can be represented with instances of the LocalDate JDK class:
jshell> LocalDate d1 = LocalDate.of(1996, 1, 23)
d1 ==> 1996-01-23
jshell> LocalDate d2 = d1.plusYears(30)
d2 ==> 2026-01-23
jshell> LocalDate d3 = d2.minusYears(30)
d3 ==> 1996-01-23
jshell> d1.equals(d3)
$4 ==> true
Developers will regard the "essence" of a LocalDate object as its year, month,
and day values. But to Java, the essence of any object is its identity.
Each time a method in LocalDate invokes new LocalDate(...), an object with a
unique identity is allocated, distinguishable from every other object in the
system.
The easiest way to observe the identity of an object is with the ==
operator:
jshell> d1 == d3
$6 ==> false
Even though d1
and d3
represent the same year-month-day triple, they are
two objects with distinct identities.
For mutable objects, identity is important: it lets us distinguish two objects
that have the same state now but will have different state in the future. For
example, suppose a class Customer has a field lastOrderedDate that is mutated
when the customer makes a new order. Two Customer objects might have the same
lastOrderedDate, but it would be a coincidence; when a customer makes a new
order, the application will mutate the lastOrderedDate of one object but not
the other, relying on identity to pick the right one.
In other words, when objects are mutable, they are not interchangeable. But most
domain values are not mutable and are interchangeable. There is no practical
difference between two LocalDate objects representing 1996-01-23, because
their state is fixed and unchanging. They represent the same domain value, both
now and in the future. There is no need to distinguish the two objects via their
identities.
In fact, object identity is often harmful if the objects in question have
immutable state and are meant to be interchangeable. Developers who stumble on
it by accident, for example by discovering that d1 == d3 is false, are often
confused by the result.
For JDK classes that model primitive values, such as Integer, the JDK uses a
cache to avoid creating objects with unique identities. However, this cache,
somewhat arbitrarily, does not extend to four-digit Integer values like 1996
(by default it covers only the values -128 through 127):
jshell> Integer i = 96, j = 96;
i ==> 96
j ==> 96
jshell> i == j
$3 ==> true
jshell> Integer x = 1996, y = 1996;
x ==> 1996
y ==> 1996
jshell> x == y
$6 ==> false
For domain values like Integer, the fact that each object has unique identity
is unwanted complexity that leads to surprising behavior and exposes
incidental implementation choices. This extra complexity could be avoided if the
language did not insist that separately-created but interchangeable objects have
distinct identities.
Java's requirement that every object have identity, even if some domain values don't want it, is a performance impediment. Typically, the JVM has to allocate memory for each newly created object, distinguishing it from every object already in the system, and then reference that memory location whenever the object is used or stored.
For example, while an array of int
values can be represented as a simple
block of memory, an array of LocalDate
values must be represented as a
sequence of pointers, each referencing a memory location where an object has
been allocated.
+----------+
| int[5]   |
+----------+
| 1996     |
| 2006     |
| 1996     |
| 1        |
| 23       |
+----------+
+--------------+
| LocalDate[5] |
+--------------+
| 87fa1a09     | -----------------------> +-----------+
| 87fa1a09     | -----------------------> | LocalDate |
| 87fb4ad2     | ------> +-----------+    +-----------+
| 00000000     |         | LocalDate |    | y=1996    |
| 87fb5366     | ---     +-----------+    | m=1       |
+--------------+   |     | y=2026    |    | d=23      |
                   v     | m=1       |    +-----------+
          +-----------+  | d=23      |
          | LocalDate |  +-----------+
          +-----------+
          | y=1996    |
          | m=1       |
          | d=23      |
          +-----------+
Even though the data modeled by an array of LocalDate
values is not
significantly more complex than an array of int
values (a year-month-day
triple is, effectively, 48 bits of primitive data), the memory footprint is far
greater.
Worse, when a program iterates over the LocalDate
array, each pointer may need
to be dereferenced. CPUs use caches to enable fast access to chunks of memory;
if the array exhibits poor
memory locality
(a distinct possibility if the LocalDate
objects were allocated at different
times or out of order), every dereference may require caching a different
chunk of memory, frustrating performance.
In some application domains, developers routinely program for speed by creating
as few objects as possible, thus de-stressing the garbage collector and
improving locality. For example, they might encode event dates with an int
representing an epoch day. Unfortunately, this approach gives
up the functionality of classes that makes Java code so maintainable:
meaningful names, private state, data validation by constructors, convenience
methods, etc. A developer operating on dates represented as int
values might
accidentally interpret the value in terms of a starting date in
1601 or 1980
rather than the intended 1970 start date.
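The hazard can be reproduced on any current JDK with the existing java.time API. The following is a minimal sketch, not taken from the JEP: the class EpochDayDemo and its helper names are hypothetical, and 1980-01-01 is used only as an example of a mistaken starting date.

```java
import java.time.LocalDate;

final class EpochDayDemo {
    // Encode a date as days since 1970-01-01, the convention of LocalDate.toEpochDay().
    static int encode(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    // Decode under the intended 1970 convention.
    static LocalDate decode(int epochDay) {
        return LocalDate.ofEpochDay(epochDay);
    }

    // Decode the same int as if it counted days from some other starting date.
    static LocalDate decodeFrom(LocalDate start, int days) {
        return start.plusDays(days);
    }

    public static void main(String[] args) {
        int raw = encode(LocalDate.of(1996, 1, 23));
        // Correct interpretation: round-trips to the original date.
        System.out.println(decode(raw));
        // Mistaken interpretation: nothing in the int type prevents this.
        System.out.println(decodeFrom(LocalDate.of(1980, 1, 1), raw));
    }
}
```

The compiler accepts both interpretations because an int carries no information about what it encodes. A class-typed date rules out the second call by construction.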
Trillions of Java objects are created every day, each one bearing a unique
identity. We believe the time has come to let Java developers choose which
objects in the program need identity, and which do not. A class like LocalDate
that represents domain values could opt out of identity, so that it would be
impossible to distinguish between two LocalDate objects representing the date
1996-01-23, just as it is impossible to distinguish between two int values
representing the number 4.
By opting out of identity, developers are opting in to a programming model that provides the best of both worlds: the abstraction of classes with the simplicity and performance benefits of primitives.
Description
Java programs manipulate objects through references. A reference to an object
is stored in a variable and lets us find the object's fields, which are merely
variables that store primitive values or references to other objects.
Traditionally, a reference also serves as the unique identity of an object: each
execution of new
allocates a fresh object and returns a unique reference that
can be stored in one variable and copied to other variables (aliasing).
Famously, the ==
operator compares objects by comparing references, so
references to two objects are not ==
even if the objects have identical field
values.
JDK NN introduces value objects to model immutable domain values. A reference
to a value object is stored in a variable and lets us find the object's fields,
but it does not serve as the unique identity of the object. Executing new
might not allocate a fresh object and might instead return a reference to an
existing object, or even a "reference" that embodies the object directly. The
==
operator compares value objects by comparing their field values, so
references to two objects are ==
if the objects have identical field values.
A value object is an instance of a value class, declared with the value
modifier. Classes without the value
modifier are called identity classes,
and their instances are identity objects.
Developers can save memory and improve performance by using value objects for
immutable data. Because programs cannot tell the difference between two objects
with identical field values (not even with ==), the Java Virtual Machine is
able to avoid allocating multiple objects for the same data. Furthermore, the
JVM can change how a value object is laid out in memory without affecting the
program; for example, its fields could be stored on the stack rather than the
heap.
The following sections explore how value objects differ from identity objects and illustrate how to declare value classes. This is followed by an in-depth treatment of the special behaviors of value objects, the JVM's run-time optimizations, and considerations for value class declarations.
Enabling preview features
Value classes and objects are a preview language feature, disabled by default.
To try the examples below in JDK NN you must enable preview features:
Compile the program with javac --release NN --enable-preview Main.java and run it with java --enable-preview Main; or,
When using the source code launcher, run the program with java --enable-preview Main.java; or,
When using jshell, start it with jshell --enable-preview.
Programming with value objects
With preview features enabled, a handful of classes in the JDK, including
Integer and LocalDate, are treated as value classes. All instances of these
classes are value objects.
In jshell, the Objects.hasIdentity method can be used to observe that these
objects lack identity, and the == operator now shows that Integer and LocalDate
objects with the same field values are equivalent.
% -> jshell --enable-preview
| Welcome to JShell -- Version 25-internal
| For an introduction type: /help intro
jshell> Integer x = 1996, y = 1996;
x ==> 1996
y ==> 1996
jshell> Objects.hasIdentity(x)
$3 ==> false
jshell> x == y
$4 ==> true
jshell> LocalDate d1 = LocalDate.of(1996, 1, 23)
d1 ==> 1996-01-23
jshell> LocalDate d2 = d1.plusYears(30)
d2 ==> 2026-01-23
jshell> LocalDate d3 = d2.minusYears(30)
d3 ==> 1996-01-23
jshell> Objects.hasIdentity(d1)
$8 ==> false
jshell> d1 == d3
$9 ==> true
For a variety of compatibility and implementation reasons, the String class is
not made a value class. Strings have identity, and different String objects
representing the same character sequence might not be ==.
jshell> String s = "abcd", t = "aabcd".substring(1);
s ==> "abcd"
t ==> "abcd"
jshell> Objects.hasIdentity(s)
$12 ==> true
jshell> s == t
$13 ==> false
Value objects are objects. They support field reads, method invocations, and
assignment to supertypes. Every object in a program, whether it has identity or
not, belongs to the type Object. Typically, code that interacts with value
objects need not know or care that they lack identity.
jshell> d1.getDayOfWeek()
$4 ==> TUESDAY
jshell> Temporal t1 = d1
t1 ==> 1996-01-23
jshell> t1.with(TemporalAdjusters.firstDayOfMonth())
$15 ==> 1996-01-01
jshell> Object[] arr = { d1, s, d3 }
arr ==> Object[3] { 1996-01-23, "abcd", 1996-01-23 }
jshell> arr[0].equals(arr[1])
$11 ==> false
jshell> arr[0].equals(arr[2])
$10 ==> true
However, certain identity-sensitive operations, such as synchronization, are not supported by value objects.
jshell> synchronized (d1) { d1.notify(); }
| Error:
| unexpected type
| required: a type with identity
| found: java.time.LocalDate
| synchronized (d1) { d1.notify();}
| ^-------------------------------^
jshell> synchronized (arr[0]) { arr[0].notify(); }
| Exception java.lang.IdentityException: Cannot synchronize on
an instance of value class java.time.LocalDate
| at (#19:1)
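Code that needs a monitor can avoid value objects entirely by locking on a dedicated identity object. The sketch below shows one common pattern, assuming a hypothetical CustomerOrders class (not part of the JEP or the JDK); it uses only standard APIs and does not depend on preview features.

```java
import java.time.LocalDate;

final class CustomerOrders {
    // Dedicated identity object used only as a monitor (hypothetical design).
    private final Object lock = new Object();
    private LocalDate lastOrderedDate;   // value-based type: never synchronize on it
    private int orderCount;

    void recordOrder(LocalDate date) {
        synchronized (lock) {
            lastOrderedDate = date;
            orderCount++;
        }
    }

    int orderCount() {
        synchronized (lock) { return orderCount; }
    }

    LocalDate lastOrderedDate() {
        synchronized (lock) { return lastOrderedDate; }
    }

    public static void main(String[] args) throws InterruptedException {
        CustomerOrders orders = new CustomerOrders();
        Runnable task = () -> {
            for (int i = 0; i < 1_000; i++) {
                orders.recordOrder(LocalDate.of(1996, 1, 23));
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(orders.orderCount());   // prints 2000
    }
}
```

Because the lock is a plain Object created with new, it has identity regardless of whether LocalDate is treated as a value class.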
JVMs have a lot of freedom to encode references to value objects at run time in ways that optimize memory footprint, locality, and garbage collection efficiency. For example, we saw the following array earlier, implemented with pointers to heap objects:
jshell> LocalDate[] dates = { d1, d1, d2, null, d3 }
dates ==> LocalDate[5] { 1996-01-23, 1996-01-23, 2026-01-23,
null, 1996-01-23 }
Now that LocalDate
objects lack identity, the JVM could implement the array
using "references" that encode the fields of each LocalDate
directly. Each
array component can be represented as a 64-bit word that indicates whether the
reference is null, and if not, directly stores the year, month, and day field
values of the value object:
+--------------+
| LocalDate[5] |
+--------------+
| 1|1996|01|23 |
| 1|1996|01|23 |
| 1|2026|01|23 |
| 0|0000|00|00 |
| 1|1996|01|23 |
+--------------+
The performance characteristics of this array are on par with an ordinary int
array:
+----------+
| int[5]   |
+----------+
| 1996     |
| 2006     |
| 1996     |
| 1        |
| 23       |
+----------+
Some value classes, like LocalDateTime, are too large to take advantage of
this particular technique. But the lack of identity enables the JVM to optimize
those classes in other ways.
Declaring value classes
Developers can declare their own value classes by applying the value
modifier
whenever a class has immutable instances that can be treated as interchangeable
whenever their field values match. Records are often great candidates to be
value classes.
jshell> value record Point(int x, int y) {}
| created record Point
jshell> Point p = new Point(17, 3)
p ==> Point[x=17, y=3]
jshell> Objects.hasIdentity(p)
$7 ==> false
jshell> new Point(17, 3) == p
$8 ==> true
Records are transparent data carriers; but some value classes have
private internal state, and are better expressed with a normal class
declaration. For example, the Substring
value class, below, represents a
string lazily, without allocating a char[]
in memory. The external state of
a Substring
object is the character sequence it represents, while the internal
state includes the original string and a pair of indices.
value class Substring implements CharSequence {
    private String str;
    private int start, end;
    public Substring(String str, int start, int end) {
        // (simplification, skipping argument validation)
        this.str = str;
        this.start = start;
        this.end = end;
    }
    public int length() { return end - start; }
    public char charAt(int i) { return str.charAt(start+i); }
    public Substring subSequence(int i, int j) {
        // (simplification, skipping argument validation)
        return new Substring(str, start+i, start+j);
    }
    public String toString() {
        return str.substring(start, end);
    }
    public boolean equals(Object o) {
        return o instanceof Substring &&
            toString().equals(o.toString());
    }
    public int hashCode() {
        return Objects.hash(Substring.class, toString());
    }
}
The fields of a value class are always implicitly final. They may store
references, both to identity objects and to other value objects. In
constructors, the fields of a value class must be initialized without reference
to this.
The class itself is also implicitly final, ensuring that all instances of the
class have the same layout.
Notice that the == operator always compares value objects' field values, even
when those fields store private internal state. So two Substring objects may
represent the same character sequence without being ==.
jshell> Substring sub1 = new Substring("ionization", 0, 3);
sub1 ==> ion
jshell> Substring sub2 = new Substring("ionization", 7, 10);
sub2 ==> ion
jshell> sub1 == sub2
$3 ==> false
jshell> sub1.equals(sub2)
$4 ==> true
As this example illustrates, the == operator does not necessarily align with
intuitions about whether the two objects represent the same data. Usually, the
right way to compare objects, whether they have identity or not, is with equals.
Value classes in the JDK
With preview features enabled, the following classes in the JDK are treated as value classes. All instances of these classes are value objects.
In java.lang: Integer, Long, Float, Double, Byte, Short, Boolean, and Character
In java.util: Optional, OptionalInt, OptionalLong, and OptionalDouble
In java.time: LocalDate, Period, Year, YearMonth, MonthDay, LocalTime, Instant, Duration, LocalDateTime, OffsetTime, OffsetDateTime, and ZonedDateTime
In java.time.chrono: HijrahDate, JapaneseDate, MinguoDate, and ThaiBuddhistDate
Because the wrapper classes (Integer, Long, etc.) are on this list,
boxed primitives are always value objects.
In contrast, String instances, including string literals, continue to have
identity.
Identity-sensitive operations
Value objects are objects. They support field reads, method invocations, and
assignment to supertypes. Every object in a program, whether it has identity or
not, belongs to the type Object. Typically, code that interacts with value
objects need not know or care that they lack identity.
There are, however, a few distinct behaviors that users of value classes may observe:
A new Objects.hasIdentity method returns false for value objects. Objects.requireIdentity is also available, throwing an IdentityException when given a value object.
jshell> String s = "abcd"
s ==> "abcd"
jshell> Integer i = 1234
i ==> 1234
jshell> Objects.hasIdentity(s)
$3 ==> true
jshell> Objects.hasIdentity(i)
$4 ==> false
jshell> Objects.requireIdentity(s)
$5 ==> "abcd"
jshell> Objects.requireIdentity(i)
| Exception java.lang.IdentityException: java.lang.Integer is not an identity class
| at Objects.requireIdentity (Objects.java:213)
| at (#6:1)
The == and != operators treat two separately-created value objects that have the same field values as equivalent, where otherwise they might be considered distinct. This behavior is the same whether the operands have a value object type or are typed as a supertype, like Object. A user of a value object should never assume unique ownership of that object.
jshell> "aabcd".substring(1) == s
$7 ==> false
jshell> (Integer) (i*1) == i
$8 ==> true
jshell> Object o = i, o2 = i*1
o ==> 1234
o2 ==> 1234
jshell> o == o2
$11 ==> true
The precise behavior of == is described in more detail in the next section.
The System.identityHashCode method hashes together a value object's class and its field values, ensuring that the same hash code is returned for equivalent value objects.
jshell> System.identityHashCode(s)
$12 ==> 1106131243
jshell> System.identityHashCode("aabcd".substring(1))
$13 ==> 1473611564
jshell> System.identityHashCode(i)
$14 ==> -520941578
jshell> System.identityHashCode(i*1)
$15 ==> -520941578
Value objects cannot be used for synchronization. Attempts to synchronize on a value class type are rejected at compile time; attempts to synchronize on a value object typed as a supertype (like Object) will fail at run time with an IdentityException.
Similarly, the usual understandings of object lifespan and garbage collection do not apply to value objects, because an equivalent instance may be recreated at any point. So the java.lang.ref API produces warnings about value class types at compile time, and throws an IdentityException at run time.
jshell> synchronized (s) { s.notify(); }
jshell> synchronized (i) { i.notify(); }
| Error:
| unexpected type
| required: a type with identity
| found: java.lang.Integer
| synchronized (i) { i.notify();}
| ^-----------------------------^
jshell> synchronized (o) { o.notify(); }
| Exception java.lang.IdentityException: Cannot synchronize on an instance of value class java.lang.Integer
| at (#17:1)
jshell> new WeakReference<>(s)
$18 ==> java.lang.ref.WeakReference@145eaa29
jshell> new WeakReference<>(i)
| Warning:
| use of a value class with an operation that expects reliable identity
| new WeakReference<>(i)
| ^
| Exception java.lang.IdentityException: java.lang.Integer is not an identity class
| at Objects.requireIdentity (Objects.java:213)
| at Reference.<init> (Reference.java:553)
| at Reference.<init> (Reference.java:548)
| at WeakReference.<init> (WeakReference.java:70)
| at (#19:1)
In anticipation of these new behaviors, the value classes in the standard library have long been marked as value-based, warning that users should not depend on the unique identities of instances. Programs that have followed this advice, avoiding identity-sensitive operations on these objects, can expect consistent behavior between releases.
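As a sketch of what following this advice can look like in practice, the hypothetical helper class below compares boxed values with Objects.equals and keys a map by equals/hashCode instead of identity. The class and method names are illustrative assumptions; only standard java.util APIs are used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class PortableComparisons {
    // Compares boxed values by content, so the result does not depend on
    // whether the two boxes happen to be the same object.
    static boolean sameValue(Integer a, Integer b) {
        return Objects.equals(a, b);
    }

    // Counts occurrences using a map keyed by equals/hashCode, not identity.
    static Map<Integer, Integer> countOccurrences(int... values) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Integer x = 1996, y = 1996;
        System.out.println(sameValue(x, y));                            // true
        System.out.println(countOccurrences(1996, 23, 1996).get(1996)); // 2
    }
}
```

Neither method uses ==, synchronization, or identity-based collections on the boxed values, so the results are the same whether or not Integer instances have identity.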
Comparing objects with ==
The ==
operator tests whether two objects are substitutable. This means
that identical operations performed on the two objects will always produce
identical results—it is impossible for a program to distinguish between the two.
For an identity object, this can only be true for the object itself: o1 == o2
only if o1
and o2
have the same unique identity.
For a value object, this is true whenever the two objects are instances of the
same class and have substitutable field values. Primitive field values are
considered substitutable if they have the same bit patterns; reference field
values are compared recursively with ==.
jshell> var opt1 = OptionalDouble.of(1.234)
opt1 ==> OptionalDouble[1.234]
jshell> var opt2 = OptionalDouble.of(1.234*1.0)
opt2 ==> OptionalDouble[1.234]
jshell> opt1 == opt2
$3 ==> true
jshell> opt1.equals(opt2)
$4 ==> true
jshell> Float f1 = Float.intBitsToFloat(0x7ff80000)
f1 ==> NaN
jshell> Float f2 = Float.intBitsToFloat(0x7ff80001)
f2 ==> NaN
jshell> f1 == f2
$7 ==> false
jshell> f1.equals(f2)
$8 ==> true
jshell> var opt3 = Optional.of("abcd")
opt3 ==> Optional[abcd]
jshell> var opt4 = Optional.of("aabcd".substring(1))
opt4 ==> Optional[abcd]
jshell> opt3 == opt4
$11 ==> false
jshell> opt3.equals(opt4)
$12 ==> true
When a value object contains a reference to an identity object (like opt3
and
opt4
, above), the recursive application of ==
simply compares the nested
objects' identities. But when a value object contains a reference to another
value object, the ==
operator's recursion performs a "deep" comparison of the
nested objects' fields. The number of comparisons is unbounded: given a
deeply-nested stack of Optional
wrappers, using ==
may require a full
traversal of that stack.
jshell> var opt5 = Optional.of(f1)
opt5 ==> Optional[NaN]
jshell> var opt6 = Optional.of(f2)
opt6 ==> Optional[NaN]
jshell> opt5 == opt6
$15 ==> false
jshell> opt5.equals(opt6)
$16 ==> true
jshell> var opt7 = Optional.of(Optional.of(opt1))
opt7 ==> Optional[Optional[OptionalDouble[1.234]]]
jshell> var opt8 = Optional.of(Optional.of(opt2))
opt8 ==> Optional[Optional[OptionalDouble[1.234]]]
jshell> opt7 == opt8
$19 ==> true
jshell> opt7.equals(opt8)
$20 ==> true
The result of a == comparison may be surprising, even for value objects. It
does not necessarily align with intuitions about whether the two objects
represent the same data. Usually, the right way to compare objects, whether they
have identity or not, is with equals.
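The recursive rule can be approximated on a current JDK for experimentation. The sketch below is an emulation under stated assumptions, not the JVM's algorithm: the class SubstitutabilitySketch is hypothetical, it hard-codes only Float, Double, and Optional as if they were value classes, and it treats every other object as an identity object.

```java
import java.util.Optional;

final class SubstitutabilitySketch {
    // Emulates the substitutability test for a few hard-coded classes.
    static boolean substitutable(Object a, Object b) {
        if (a == b) return true;                   // same reference, or both null
        if (a == null || b == null) return false;
        if (a instanceof Float f1 && b instanceof Float f2) {
            // Primitive fields: compare raw bit patterns, so NaNs can differ.
            return Float.floatToRawIntBits(f1) == Float.floatToRawIntBits(f2);
        }
        if (a instanceof Double d1 && b instanceof Double d2) {
            return Double.doubleToRawLongBits(d1) == Double.doubleToRawLongBits(d2);
        }
        if (a instanceof Optional<?> o1 && b instanceof Optional<?> o2) {
            // Reference field: recurse into the wrapped object (null if empty).
            return substitutable(o1.orElse(null), o2.orElse(null));
        }
        return false;                              // distinct identity objects
    }

    public static void main(String[] args) {
        Float f1 = Float.intBitsToFloat(0x7ff80000);
        Float f2 = Float.intBitsToFloat(0x7ff80001);
        // Two NaNs with different payloads: equals says true, bit patterns differ.
        System.out.println(f1.equals(f2));
        System.out.println(substitutable(f1, f2));
        // Nested wrappers are traversed all the way down.
        System.out.println(substitutable(
            Optional.of(Optional.of(1.234)), Optional.of(Optional.of(1.234 * 1.0))));
        // Wrapped identity objects are compared by identity only.
        System.out.println(substitutable(
            Optional.of("abcd"), Optional.of("aabcd".substring(1))));
    }
}
```

The recursion depth follows the nesting depth of the wrappers, which is why the number of comparisons performed by == on value objects has no fixed bound.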
Run-time optimizations for value objects
Because there is no need to preserve identity, Java Virtual Machine implementations have a lot of freedom to encode references to value objects at run time in ways that optimize memory footprint, locality, and garbage collection efficiency. Optimization techniques will typically duplicate, re-encode, or re-use value objects to achieve these goals. Re-encoding might be useful, for example, to store a value object's field values directly in a variable, reducing the number of memory loads to access the object's data.
This section describes abstractly some of the JVM optimization techniques implemented by HotSpot. It is not comprehensive or prescriptive, but offers a taste of how value objects enable improved performance.
Value object scalarization
Scalarization is one important optimization enabled by the lack of identity. A scalarized reference to a value object is reduced to its "essence", a set of the object's field values without any enclosing container. A scalarized reference is essentially "free" to create and use at run time, having no impact on the normal object allocation and garbage collection processes.
In HotSpot, scalarization is a JIT compilation technique, affecting the representation of references to value objects in the bodies and signatures of JIT-compiled methods.
The following illustrates how the JIT compiler might translate the
LocalDate.plusYears method to scalarize its input and output. The "essence" of
a LocalDate reference is an int and two byte values representing a
year-month-day triple, along with a boolean to indicate whether the reference
is null (in which case the three fields can be ignored). In this pseudocode,
the notation { ... } refers to a vector of multiple values that can be
returned from a scalarized method. Importantly, this is purely notational: there
is no wrapper at run time.
// original method:
public LocalDate plusYears(long yearsToAdd) {
    // avoid overflow:
    int newYear = YEAR.checkValidIntValue(this.year + yearsToAdd);
    // (simplification, skipping leap year adjustment)
    return new LocalDate(newYear, this.month, this.day);
}
// effectively:
static { boolean, int, byte, byte }
    $plusYears(boolean this_null, int this_year,
               byte this_month, byte this_day,
               long yearsToAdd) {
    if (this_null) throw new NullPointerException();
    int newYear = YEAR.checkValidIntValue(this_year + yearsToAdd);
    return { false, newYear, this_month, this_day };
}
// original invocation:
LocalDate.of(1996, 1, 23).plusYears(30);
// effectively:
$plusYears(false, 1996, 1, 23, 30);
JVMs have used similar techniques to scalarize identity objects in local code when the JVM is able to prove that an object's identity is never used. But scalarization of value objects is more predictable and far-reaching, even across non-inlinable method invocation boundaries.
One limitation of scalarization is that it is not typically applied to a
variable with a type that is a supertype of a value class type. Notably, this
includes method parameters of generic code whose erased type is Object.
Instead, when an assignment to a supertype occurs, a scalarized value object
reference may be converted to an ordinary heap object reference. But this
allocation occurs only when necessary, and as late as possible.
Heap flattening
Heap flattening is another important optimization enabled by value objects' lack of identity. The "essence" of a reference to a value object is encoded as a compact bit vector, without any pointer to a different memory location. This bit vector can then be stored directly on the heap, in a field or an array of a value class type.
Heap flattening is useful because a flattened value object reference requires less memory than a pointer to a separately-allocated object, and because the data is stored locally, avoiding expensive cache misses. These benefits can significantly improve some programs' memory footprint and execution time.
To illustrate, an array of LocalDate
references could directly store 64-bit
encodings of the referenced objects. Note that, as for scalarization, an extra
flag is needed to keep track of null
references.
A pointer-based array might look like this:
+--------------+
| LocalDate[5] |
+--------------+
| 87fa1a09     | -----------------------> +-----------+
| 87fa1a09     | -----------------------> | LocalDate |
| 87fb4ad2     | ------> +-----------+    +-----------+
| 00000000     |         | LocalDate |    | y=1996    |
| 87fb5366     | ---     +-----------+    | m=1       |
+--------------+   |     | y=2026    |    | d=23      |
                   v     | m=1       |    +-----------+
          +-----------+  | d=23      |
          | LocalDate |  +-----------+
          +-----------+
          | y=1996    |
          | m=1       |
          | d=23      |
          +-----------+
While a flattened array could look like:
+--------------+
| LocalDate[5] |
+--------------+
| 1|1996|01|23 |
| 1|1996|01|23 |
| 1|2026|01|23 |
| 0|0000|00|00 |
| 1|1996|01|23 |
+--------------+
Reads from the array essentially produce long
values, which can then be mapped
to scalarized references (as in the previous section) or pointers to
heap-allocated objects.
The details of flattened encodings will vary, of course, at the discretion of the JVM implementation.
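To make the idea of a flattened encoding concrete, the sketch below packs a nullable LocalDate into a long by hand. The bit layout, the class FlatDateArraySketch, and its method names are assumptions chosen for illustration; HotSpot's actual encoding is an implementation detail and may differ.

```java
import java.time.LocalDate;

final class FlatDateArraySketch {
    // Hypothetical layout: bit 48 = non-null flag, bits 16..47 = year,
    // bits 8..15 = month, bits 0..7 = day.
    private static final long PRESENT = 1L << 48;

    static long encode(LocalDate d) {
        if (d == null) return 0L;                       // flag bit clear means null
        return PRESENT
             | ((d.getYear() & 0xFFFF_FFFFL) << 16)     // 32-bit year, sign preserved
             | ((long) d.getMonthValue() << 8)
             | d.getDayOfMonth();
    }

    static LocalDate decode(long bits) {
        if ((bits & PRESENT) == 0) return null;
        int year = (int) (bits >>> 16);                 // low 32 bits after the shift
        int month = (int) ((bits >>> 8) & 0xFF);
        int day = (int) (bits & 0xFF);
        return LocalDate.of(year, month, day);
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(1996, 1, 23);
        LocalDate d2 = d1.plusYears(30);
        LocalDate[] dates = { d1, d1, d2, null, d1 };

        // One contiguous block of memory, no per-element pointers.
        long[] flat = new long[dates.length];
        for (int i = 0; i < dates.length; i++) {
            flat[i] = encode(dates[i]);
        }
        for (long bits : flat) {
            System.out.println(decode(bits));
        }
    }
}
```

Each element fits in a single 64-bit word, so it can be read and written atomically on common platforms. Decoding yields an equal LocalDate, although on a JDK without value classes it is a newly allocated object with its own identity.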
Heap flattening must maintain the integrity of objects. For example, the
flattened data must be small enough to read and write atomically, or else it may
become corrupted. On common platforms, "small enough" may mean as few as 64
bits, including the null flag. So while many small value classes can be
flattened, classes like Double
and LocalDateTime
might have to be encoded
as ordinary heap objects.
In the future, 128-bit flattened encodings may be used on platforms that support atomic reads and writes of that size. And the Null-Restricted Value Types JEP will enable heap flattening for even larger value classes if the programmer is willing to opt out of atomicity guarantees.
Declaring value classes
A program can declare its own value classes by applying the value
modifier
to classes that represent domain values. These classes opt out of identity.
For example, a class representing US dollar currency values (to two decimal places) might be a good value class candidate, assuming the class is immutable and treats instances representing the same currency value as interchangeable. Such a class has no use for identity.
Many developers might opt for using a float
or an int
to encode currency
values instead, but by wrapping a primitive with a value class, the author is
able to take advantage of the powerful features of classes without a significant
performance penalty.
value class USDCurrency implements Comparable<USDCurrency> {
    private int cs; // amount in cents; implicitly final
    private USDCurrency(int cs) { this.cs = cs; }
    public USDCurrency(int dollars, int cents) {
        this(dollars * 100 + (dollars < 0 ? -cents : cents));
    }
    public int dollars() { return cs/100; }
    public int cents() { return Math.abs(cs%100); }
    public USDCurrency plus(USDCurrency that) {
        return new USDCurrency(cs + that.cs);
    }
    public int compareTo(USDCurrency that) {
        // Integer.compare avoids the overflow that cs - that.cs can produce
        return Integer.compare(cs, that.cs);
    }
}
The instance fields of a value class are implicitly final. The instance
methods of a value class must not be synchronized.
A concrete value class cannot be extended and is implicitly final. This
ensures that all instances of the class have the same layout.
In most other respects, a value class declaration is just like any other class declaration. It may declare constructors, static members, type parameters, and initializers. It may be a top-level class, a member class, or a local class.
Value records
Record classes are transparent data carriers, and are often great candidates to be value classes.
The following record declaration provides a lightweight wrapper around 24-bit
color values. Rather than using a plain int
to encode colors, this approach
provides a rich API for manipulating color values while ensuring components are
interpreted correctly. Yet the JVM is able to encode references to color objects
so that they have no more overhead than a plain int
.
value record Color(byte red, byte green, byte blue) {
    public Color(int r, int g, int b) {
        this(checkByte(r), checkByte(g), checkByte(b));
    }
    private static byte checkByte(int x) {
        if (x < 0 || x > 255)
            throw new IllegalArgumentException();
        return (byte) (x & 0xff);
    }
    public Color mix(Color that) {
        return new Color(avg(red, that.red),
                         avg(green, that.green),
                         avg(blue, that.blue));
    }
    private static byte avg(byte b1, byte b2) {
        return (byte) (((b1 & 0xff) + (b2 & 0xff)) / 2);
    }
}
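The byte handling in this record can be tried out today with a plain record. The sketch below is such a variant, without the value modifier. The names Rgb and unsigned are hypothetical and chosen only for this illustration. It shows that components in the range 128 to 255 are stored as negative bytes and must be masked with 0xff to recover their unsigned value.

```java
// Hypothetical plain-record variant of Color (no value modifier), for
// illustration only; it compiles on any JDK that supports records.
record Rgb(byte red, byte green, byte blue) {
    Rgb(int r, int g, int b) {
        this(checkByte(r), checkByte(g), checkByte(b));
    }

    private static byte checkByte(int x) {
        if (x < 0 || x > 255)
            throw new IllegalArgumentException("component out of range: " + x);
        return (byte) x; // 128..255 are stored as negative bytes
    }

    // Recover the unsigned 0..255 value of a stored component.
    static int unsigned(byte b) { return b & 0xff; }

    Rgb mix(Rgb that) {
        // byte arguments select the canonical (byte, byte, byte) constructor
        return new Rgb(avg(red, that.red), avg(green, that.green), avg(blue, that.blue));
    }

    private static byte avg(byte b1, byte b2) {
        return (byte) ((unsigned(b1) + unsigned(b2)) / 2);
    }

    public static void main(String[] args) {
        Rgb mixed = new Rgb(200, 0, 100).mix(new Rgb(100, 50, 0));
        System.out.println(mixed.red());           // raw signed byte
        System.out.println(unsigned(mixed.red())); // unsigned view of the same component
    }
}
```

Callers that pass int arguments go through the validating constructor, while internal code that already holds bytes reaches the canonical constructor directly.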
Not every record should be a value record. For some records, identity is important. Thus, value records require explicit use of the value modifier, just like classes. The record keyword opts out of private, internal state, while the value keyword opts out of identity.
As described earlier, the == operation compares the internal state of value objects (that is, the data that they store in fields). By default, the equals method of a value class, inherited from Object, performs this same comparison. The USDCurrency class, declared above, has no need to override equals because the inherited behavior (comparing two cs fields) is appropriate.
Sometimes, the external state of a value class (that is, the data that it represents) differs from its internal state, and so it is necessary to override equals, just as one would do when declaring an identity class.
In the following example, the value class Substring implements CharSequence. A Substring represents a string lazily, without allocating a char[] in memory. Naturally, then, two Substring objects should be considered equal if they represent the same string, regardless of differences in their internal state.
import java.util.Objects;
value class Substring implements CharSequence {
    private String str;
    private int start, end;
    public Substring(String str, int start, int end) {
        // (simplification, skipping argument validation)
        this.str = str;
        this.start = start;
        this.end = end;
    }
    public int length() { return end - start; }
    public char charAt(int i) { return str.charAt(start+i); }
    public Substring subSequence(int i, int j) {
        // (simplification, skipping argument validation)
        return new Substring(str, start+i, start+j);
    }
    public String toString() {
        return str.substring(start, end);
    }
    public boolean equals(Object o) {
        return o instanceof Substring &&
               toString().equals(o.toString());
    }
    public int hashCode() {
        return Objects.hash(Substring.class, toString());
    }
}
jshell> Substring s1 = new Substring("ionization", 0, 3);
s1 ==> ion
jshell> Substring s2 = new Substring("ionization", 7, 10);
s2 ==> ion
jshell> s1 == s2
$3 ==> false
jshell> s1.equals(s2)
$4 ==> true
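The jshell session above requires preview features, but the equals and hashCode contract it relies on can be exercised with an ordinary class. The following sketch uses a hypothetical identity-class counterpart named Slice (not part of this JEP) to show that two objects with different internal state but the same external state are treated as the same element by a hash-based collection.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical identity-class counterpart of Substring, for illustration only.
final class Slice implements CharSequence {
    private final String str;
    private final int start, end;

    Slice(String str, int start, int end) {
        // (simplification, skipping argument validation)
        this.str = str;
        this.start = start;
        this.end = end;
    }

    @Override public int length() { return end - start; }
    @Override public char charAt(int i) { return str.charAt(start + i); }
    @Override public Slice subSequence(int i, int j) {
        return new Slice(str, start + i, start + j);
    }
    @Override public String toString() { return str.substring(start, end); }

    // Equality is defined on the represented string, not on the fields.
    @Override public boolean equals(Object o) {
        return o instanceof Slice && toString().equals(o.toString());
    }
    @Override public int hashCode() {
        return Objects.hash(Slice.class, toString());
    }

    public static void main(String[] args) {
        Slice s1 = new Slice("ionization", 0, 3);
        Slice s2 = new Slice("ionization", 7, 10);
        Set<Slice> seen = new HashSet<>();
        seen.add(s1);
        System.out.println(s1.equals(s2));     // same external state
        System.out.println(seen.contains(s2)); // consistent hashCode makes the lookup succeed
    }
}
```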
Similarly to equals, the inherited hashCode method delegates to System.identityHashCode, which produces a hash of a value object's internal state; naturally, any class that overrides equals should also override hashCode.
In a value record, as for all records, the implicit equals recursively applies equals to the record components. This behavior does not always match the behavior of ==. For example, if a value record has a String component, == compares the strings' identities, while equals compares their contents.
jshell> value record Country(String code) {}
| created record Country
jshell> Country c1 = new Country("SWE")
c1 ==> Country[code=SWE]
jshell> Country c2 = new Country("SWE")
c2 ==> Country[code=SWE]
jshell> Country c3 = new Country("SWEDEN".substring(0,3))
c3 ==> Country[code=SWE]
jshell> c1 == c2
$9 ==> true
jshell> c1 == c3
$10 ==> false
jshell> c1.equals(c2)
$11 ==> true
jshell> c1.equals(c3)
$12 ==> true
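The difference between c1 == c3 and c1.equals(c3) comes down to how the String components are compared. The sketch below makes that component-level comparison explicit using a plain (identity) record, so it runs without preview features. The name CountryCode is hypothetical, and new String("SWE") plays the role of the substring call above by producing a distinct String object with the same contents.

```java
// Plain (identity) record, used only to illustrate component-wise comparison.
record CountryCode(String code) {
    public static void main(String[] args) {
        String literal = "SWE";
        String copy = new String("SWE"); // distinct String object, same contents

        CountryCode c1 = new CountryCode(literal);
        CountryCode c3 = new CountryCode(copy);

        // What == on a value record would compare: the identities of the components.
        System.out.println(c1.code() == c3.code());
        // What the implicit record equals compares: the contents of the components.
        System.out.println(c1.code().equals(c3.code()));
        System.out.println(c1.equals(c3));
    }
}
```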
It is always the responsibility of the class author to ensure the equals method appropriately interprets the class's external state. Sometimes the default behavior is appropriate, and sometimes an explicit equals needs to be declared.
A few other methods of Object are of special interest to value classes:
The default toString has the usual form, "ClassName@hashCode", but note that the default hashCode of a value object is derived from its field values, not object identity. Most value classes will want to override toString to more legibly convey the domain value represented by the object.
For a Cloneable value class, the Object.clone method produces a value object that is indistinguishable from the original; the usual expectation that x.clone() != x is not meaningful for value objects. Value classes that store references to identity objects may wish to override clone and perform a "deep copy" of these identity objects.
The wait and notify methods require that the object be locked in the current thread; since it is impossible to synchronize on a value object, attempts to call these methods will always fail with an IllegalMonitorStateException.
The finalize method of a value object will never be invoked by the garbage collector.
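The wait and notify behavior described above follows from an existing rule that can be observed with any identity object: calling notify without holding the object's monitor throws IllegalMonitorStateException. The sketch below demonstrates that rule with a plain Object. The class and method names are hypothetical. Since a value object's monitor can never be acquired, only the failing case would apply to it.

```java
// Illustration with an ordinary identity object: notify fails unless the
// calling thread holds the object's monitor.
final class MonitorDemo {
    // Returns true if o.notify() throws IllegalMonitorStateException.
    static boolean notifyThrows(Object o) {
        try {
            o.notify();
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    // Same check, performed while the current thread holds o's monitor.
    static boolean notifyThrowsWhileLocked(Object o) {
        synchronized (o) {
            return notifyThrows(o);
        }
    }

    public static void main(String[] args) {
        Object o = new Object();
        System.out.println(notifyThrows(o));            // monitor not held
        System.out.println(notifyThrowsWhileLocked(o)); // monitor held
    }
}
```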
A value class can implement any interface. When a variable has an interface type, that variable may store a value object or an identity object.
A value class cannot extend a concrete class (with the special exception of Object).
Abstract classes may be value classes or identity classes. The value modifier applied to an abstract class indicates that the class has no need for identity, but does not restrict its subclasses. Just like an interface, an abstract value class may be extended by both identity classes and value classes. An abstract class without the value modifier is an identity class, and may only be extended by other identity classes.
Thus, a value class can extend an abstract value class, but cannot extend an abstract identity class.
Many abstract classes are good value class candidates. The class Number, for example, has no fields, nor any code that depends on identity-sensitive features.
abstract value class Number implements Serializable {
    public abstract int intValue();
    public abstract long longValue();
    public byte byteValue() { return (byte) intValue(); }
    ...
}
All of the usual rules for value classes apply to abstract value classes. For example, any instance fields of an abstract value class are implicitly final.
With preview features enabled, the JDK classes Number and Record are treated as abstract value classes.
Constructors initialize newly-created objects, including setting the values of the objects' fields. Because value objects do not have identity, their initialization requires special care.
An object being constructed is "larval"—it has been created, but it is not yet fully-formed. Larval objects must be handled carefully, because the expected properties and invariants of the object may not yet hold—for example, the fields of a larval object may not be set. If a larval object is shared with outside code, that code may even observe the mutation of a final field!
Traditionally, a constructor begins the initialization process by invoking a superclass constructor, super(...). After the superclass returns, the subclass then proceeds to set its declared instance fields and perform other initialization tasks. This pattern exposes a completely uninitialized subclass to any larval object leakage occurring in a superclass constructor.
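To make the risk concrete, here is a minimal sketch of larval object leakage in traditional identity classes. The class names Base and Leaky and the observed list are hypothetical. The superclass constructor calls an overridable method, so subclass code runs before the subclass's final field has been assigned and sees its default value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes showing larval object leakage with traditional
// construction in identity classes.
class Base {
    Base() {
        observe(); // overridable call: exposes the larval object to subclass code
    }
    void observe() { }
}

final class Leaky extends Base {
    // Records every value of x that observe() sees.
    static final List<Integer> observed = new ArrayList<>();

    final int x;

    Leaky() {
        super();   // runs observe() before x is assigned
        x = 42;
        observe(); // runs again, after x is assigned
    }

    @Override
    void observe() {
        observed.add(x);
    }

    public static void main(String[] args) {
        new Leaky();
        // The final field x is seen first with its default value, then with 42.
        System.out.println(observed);
    }
}
```

The same final field is observed with two different values, which is the kind of mutation that early construction is designed to rule out.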
The Flexible Constructor Bodies feature enables an alternative approach to initialization, in which fields can be set and other code executed before the super(...) invocation. There is a two-phase initialization process: early construction before the super(...) invocation, and late construction afterwards.
During the early construction phase, larval object leakage is impossible: the constructor may set the fields of the larval object, but may not invoke instance methods or otherwise make use of this. Fields that are initialized in the early phase are set before they can ever be read, even if a superclass leaks the larval object. Final fields, in particular, can never be observed to mutate.
In a value class, all constructor and initializer code normally occurs in the early construction phase. This means that attempts to invoke instance methods or otherwise use this will fail:
value class Name {
    String name;
    int length;
    Name(String n) {
        name = n;
        length = computeLength(); // error!
    }
    private int computeLength() {
        return name.length();
    }
}
Fields that are declared with initializers get set at the start of the constructor (as usual), but any implicit super() call gets placed at the end of the constructor body.
When a constructor includes code that needs to work with this, an explicit super(...) or this(...) call can be used to mark the transition to the late phase. But all fields must be initialized before the super(...) call, without reference to this:
value class Name {
    String name;
    int length;
    Name(String n) {
        name = n;
        length = computeLength(name); // ok
        super(); // all fields must be set at this point
        System.out.println("Name: " + this);
    }
    // refactored to be static:
    private static int computeLength(String n) {
        return n.length();
    }
}
For convenience, the early construction rules are relaxed to allow the class's fields to be read as well as written: both references to the field name in the above constructor are legal. It continues to be illegal to refer to inherited fields, invoke instance methods, or share this with other code until the late construction phase.
Instance initializer blocks (a rarely-used feature) continue to run in the late phase, and so may not assign to value class instance fields.
Note that these restrictions make it impossible for a value object to be created with a field storing a reference back to the object. This is important for the == operation, which would otherwise risk getting stuck in an infinite loop. (Of course, it is still possible to mutate a referenced identity object to store a this reference. But such cycles can only be created with the aid of an identity object, and == does not recur on the contents of identity objects.)
This scheme is also appropriate for identity records, so this JEP modifies the language rules for records such that their constructors always run in the early construction phase. When access to this is needed, an explicit super() can be inserted, but the record's fields must be set beforehand. The following record declaration will fail to compile when preview features are enabled, because it now makes reference to this in the early construction phase.
record Node(String label, List<Node> edges) {
    public Node {
        validateNonNull(this, label); // error!
        validateNonNull(this, edges); // error!
    }
    static void validateNonNull(Object o, Object val) {
        if (val == null) {
            throw new IllegalArgumentException(
                "null arg for " + o);
        }
    }
}
(Note that this attempt to provide useful diagnostics by sharing this is misguided anyway: in a record's compact constructor, the fields are not set until the end of the constructor body; before they are set, the toString result will always be Node[label=null, edges=null].)
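The toString behavior mentioned in the note can be checked with a small sketch that compiles only when the preview feature is not enabled, since under this JEP the use of this in the compact constructor is rejected. The record name GraphNode and the static field lastLarvalView are hypothetical and exist only to capture the result.

```java
import java.util.List;

// Compiles only WITHOUT the preview feature; with it, the use of `this` in the
// compact constructor is an error. Names are hypothetical.
record GraphNode(String label, List<GraphNode> edges) {
    // Captures what toString() returns while the object is still larval.
    static String lastLarvalView;

    GraphNode {
        // The record's fields are assigned after this body completes,
        // so toString() still sees their default values here.
        lastLarvalView = String.valueOf(this);
    }

    public static void main(String[] args) {
        GraphNode n = new GraphNode("a", List.of());
        System.out.println(lastLarvalView); // view taken during construction
        System.out.println(n);              // view after construction
    }
}
```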
Finally, in normal identity classes, we think developers should write constructors and initializers that avoid the risk of larval object leakage by generally adopting the early construction constraints: read and write the declared fields of the class, but otherwise avoid any dependency on this, and where a dependency is necessary, mark it as deliberate by putting it after an explicit super(...) or this(...) call. To encourage this style, javac provides lint warnings indicating this dependencies in normal identity class constructors. (In the future, we anticipate that normal identity classes will have a way to adopt the constructor timing of value classes and records. A class that compiles without warning will likely be able to cleanly make that transition.)
Traditional object deserialization in the JDK works by bypassing the constructor of a Serializable class and manually setting the class's fields instead. This approach to object construction violates the safe construction constraints described above, and is not compatible with value objects.
Fortunately, deserialization of Serializable value classes is fully supported whenever this traditional mechanism is not needed:
Value records are deserialized by invoking the canonical constructor directly
The value classes in the JDK, including the wrapper classes, are recognized by the serialization API and cleanly deserialized
Any value class whose serialization is handled by writeReplace and readResolve methods will be reconstructed through a standard constructor
For Serializable classes that do not fall into any of these categories, it may not be appropriate to migrate them to be value classes for now. In the future, enhancements to the serialization mechanism are anticipated that will allow value classes to be directly deserialized.
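The writeReplace and readResolve approach mentioned above is commonly known as the serialization proxy pattern. The following is a minimal sketch of that pattern on an ordinary identity class, assuming a hypothetical class Celsius with a nested SerializedForm proxy; none of these names come from this JEP. The point it illustrates is that the deserialized object is rebuilt through the ordinary constructor rather than by setting fields directly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical identity class using the serialization proxy pattern.
final class Celsius implements Serializable {
    private final int tenths; // temperature in tenths of a degree

    Celsius(int tenths) {
        if (tenths < -2732) // simplified lower bound, for illustration
            throw new IllegalArgumentException("below absolute zero");
        this.tenths = tenths;
    }

    int tenths() { return tenths; }

    // Serialize a proxy object in place of this object.
    private Object writeReplace() { return new SerializedForm(tenths); }

    // Reject streams that try to bypass the proxy.
    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("proxy required");
    }

    private static final class SerializedForm implements Serializable {
        private final int tenths;
        SerializedForm(int tenths) { this.tenths = tenths; }

        // Rebuild through the ordinary constructor, so its checks run again.
        private Object readResolve() { return new Celsius(tenths); }
    }

    // Helper that serializes to memory and reads the object back.
    static Celsius roundTrip(Celsius c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(c);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Celsius) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Celsius(215)).tenths());
    }
}
```

In this sketch only the proxy, which remains an identity class, is handled by the traditional field-setting mechanism.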
Value classes with private fields are somewhat less encapsulated than identity classes.
Someone who wants to determine a value object's field values might, for example, guess at the field values, construct another object that wraps their guess, and then compare the private fields of the two objects with ==. Or they might call System.identityHashCode and try to reverse-engineer the field values from the hash code. These techniques are not possible when trying to observe the private contents of an identity object.
The following class, PIN, may be putting a private credential at risk by being declared as a value class.
public value class PIN {
    private int val;
    public PIN(int val) { this.val = val; }
    public String toString() { return "****"; }
    // no accessors
}
var pin = secureSystem.getPIN();
for (int i = 0; i < 10000; i++) {
    var guess = new PIN(i);
    if (pin == guess)
        System.out.println("Cracked PIN: "+i);
}
When declaring a value class, it's important to keep these risks in mind. In some cases involving sensitive data, an identity class may be a better fit.
Migration of existing classes
Developers are encouraged to identify and eventually migrate value class candidates in their own code. Records and other classes that represent domain values are potential candidates, along with interface-like abstract classes.
When an identity class is intended to become a value class in a future release, its authors should consider the following:
On migration, all instance fields of the class will implicitly be made final and will need to be initialized without any reference to this. If that presents difficulties, the class may not be a good migration candidate. If there are any non-private, non-final fields, the change will need to be coordinated with any users who might attempt to mutate the fields.
Similarly, a concrete, non-final class will become final on migration. If users have been allowed to both extend the class and create instances with constructors, the author must choose to either break subclasses (by adding final), break instance creations (by adding abstract along with, say, factory methods and a private implementation class), or conclude that the class is not a good migration candidate.
The equals and hashCode methods should be overridden by the class so that their results are consistent before and after migration.
Users of the class will be able to observe different == behavior after migration. If this is a concern, an ideal migration candidate might declare private constructors and provide a factory method that explicitly advertises the possibility of results that are == to a previous result. (See, for example, the Integer.valueOf factory method.)
Classes that encode large data structures with value object references (a linked list, for example) will have relatively slow == performance, and may not be good candidates for migration. Classes with an unusually large number of fields may, similarly, find that migrating to be a value class has a negative performance impact.
Attempts to synchronize on instances or use the java.lang.ref API will fail after migration. Of course, the class itself should not declare synchronized methods or otherwise use these features. There's not much that can be done to prevent users from doing so, but it may be helpful to advertise the risk in the class's documentation.
As addressed in earlier sections, classes that are Serializable or that encapsulate sensitive data may not be good migration candidates.
If the superclass is not Object, it must be made a value class before this class can be migrated. All of the considerations in this section apply to the superclass.
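As one possible shape for a class that is preparing for migration, the sketch below combines several of the considerations above: it is final, its field is final, its constructor is private, it overrides equals and hashCode based on state, and its factory method documents that results may be == to earlier results. The class Percent and its cache are hypothetical and only illustrate the idea; the Integer.valueOf comparison relies on that method's documented caching of values from -128 to 127.

```java
// Hypothetical identity class shaped as a migration candidate.
final class Percent {
    private static final Percent[] CACHE = new Percent[101];
    static {
        for (int i = 0; i < CACHE.length; i++) CACHE[i] = new Percent(i);
    }

    private final int value;

    private Percent(int value) { this.value = value; }

    /**
     * Returns a Percent for the given value. The result may be == to a
     * previously returned instance; callers should compare with equals.
     */
    static Percent of(int value) {
        if (value < 0 || value > 100)
            throw new IllegalArgumentException("out of range: " + value);
        return CACHE[value];
    }

    int value() { return value; }

    // State-based equals and hashCode give the same results before and after migration.
    @Override public boolean equals(Object o) {
        return o instanceof Percent && ((Percent) o).value == value;
    }
    @Override public int hashCode() { return Integer.hashCode(value); }

    public static void main(String[] args) {
        System.out.println(Percent.of(40).equals(Percent.of(40)));
        // Integer.valueOf caches values in the range -128 to 127.
        System.out.println(Integer.valueOf(100) == Integer.valueOf(100));
    }
}
```

Code written against such a factory cannot assume that each call yields a fresh identity, so it is less likely to be affected when == starts comparing state.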
Classes that have addressed these concerns can expect a smooth migration to becoming value classes, without any compatibility issues. All existing binaries will continue to link successfully. The only new compiler errors will be attempts to synchronize on the value class type.
JVM Features
This JEP includes two preview JVM features to support value classes:
The ACC_SUPER class modifier has been repurposed as ACC_IDENTITY. Identity classes set this flag; interfaces and value classes leave it unset.
Any class that uses a value class type in one of its field or method descriptors should list that type in a new LoadableDescriptors class attribute. This attribute authorizes the JVM to load the named value classes early enough that it can optimize the layouts of references to instances from the class that contains the attribute.
Additionally, compiled value classes use the features of Strict Field Initialization in the JVM (Preview) to guarantee that the class's fields are properly initialized.
Alternatives
As discussed, JVMs have long performed escape analysis to identify objects that never rely on identity throughout their lifespan and can be scalarized. These optimizations are somewhat unpredictable, and do not help with objects that escape the scope of the optimization, including storage in fields and arrays.
Hand-coded optimizations via primitive values are possible to improve performance, but as noted in the "Motivation" section, these techniques require giving up valuable abstractions.
The C language and its relatives support flattened storage for structs and similar class-like abstractions. For example, the C# language has value types. Unlike value objects, instances of these abstractions have identity, meaning they support operations such as field mutation. As a result, the semantics of copying on assignment, invocation, etc., must be carefully specified, leading to a more complex user model and less flexibility for runtime implementations. We prefer an approach that leaves these low-level details to the discretion of JVM implementations.
Risks and Assumptions
The feature makes significant changes to the Java object model. Developers may be surprised by, or encounter bugs due to, changes in the behavior of operations such as == and synchronized. We expect such disruptions to be rare and tractable.
Some changes could potentially affect the performance of identity objects. The if_acmpeq test, for example, typically only costs one instruction cycle, but will now need an additional check to detect value objects. But the identity class case can be optimized as a fast path, and we believe we have minimized any performance regressions.
There is a security risk that == and hashCode can indirectly expose private field values. Further, two large trees of value objects can take unbounded time to compute ==, potentially a DoS attack risk. Developers need to understand these risks.
Dependencies
Prerequisites:
In anticipation of this feature we already added warnings about potential behavioral incompatibilities for value class candidates in javac and HotSpot, via Warnings for Value-Based Classes and JDK-8354556.
Flexible Constructor Bodies allows constructors to execute statements before a super(...) call and allows assignments to instance fields in this context. These changes facilitate the construction protocol required by value classes.
Strict Field Initialization in the JVM (Preview) provides the JVM mechanism necessary to require, through verification, that value class instance fields are initialized during early construction.
Future work:
Null-Restricted Value Class Types (Preview) will build on this JEP, allowing programmers to manage the storage of nulls and enable more dense heap flattening in fields and arrays.
Enhanced Primitive Boxing (Preview) will enhance the language's use of primitive types, taking advantage of the lighter-weight characteristics of boxing to value objects.
JVM class and method specialization (JEP 218, with revisions) will allow generic classes and methods to specialize field, array, and local variable layouts when parameterized by value class types.
Relates to:
JDK-8277163 Value Objects (Preview) (Closed)
JDK-8354832 Shallow equality of value objects (New)