-
CSR
-
Resolution: Unresolved
-
P3
-
None
-
behavioral
-
low
-
-
Java API
-
SE
Summary
By removing the caching fields in AbstractMap
, we can make the immutable maps @ValueBased
, smaller and, at the same time, improve performance. In fact, all Map implementations based on AbstractMap
will be smaller.
Problem
As the current implementation of AbstractMap
declares fields that cache the Map's keySet and values views. This prevents subclasses from being @ValueBased
. Caching the views might have been a good idea around the turn of the millennium but these days, the JVM is oftentimes able to optimize away the creation of new objects that do not escape.
Solution
Instead of caching the views for potential reuse, new instances are created on each call to the keySet() and values() methods. This allows removal of cache fields:
transient Set<K> keySet;
transient Collection<V> values;
According to the previous specs, the methods were not guaranteed to return the same cached objects due to concurrency issues, so it would theoretically be okay to just remove them. However, a single-threaded application would observe the view objects returned always being identical.
Specification
diff --git a/src/java.base/share/classes/java/util/AbstractMap.java b/src/java.base/share/classes/java/util/AbstractMap.java
index c5ea53e17db8..52fb5af15889 100644
--- a/src/java.base/share/classes/java/util/AbstractMap.java
+++ b/src/java.base/share/classes/java/util/AbstractMap.java
@@ -301,124 +301,48 @@ public void clear() {
entrySet().clear();
}
-
- // Views
-
- /**
- * Each of these fields are initialized to contain an instance of the
- * appropriate view the first time this view is requested. The views are
- * stateless, so there's no reason to create more than one of each.
- *
- * <p>Since there is no synchronization performed while accessing these fields,
- * it is expected that java.util.Map view classes using these fields have
- * no non-final fields (or any fields at all except for outer-this). Adhering
- * to this rule would make the races on these fields benign.
- *
- * <p>It is also imperative that implementations read the field only once,
- * as in:
- *
- * <pre> {@code
- * public Set<K> keySet() {
- * Set<K> ks = keySet; // single racy read
- * if (ks == null) {
- * ks = new KeySet();
- * keySet = ks;
- * }
- * return ks;
- * }
- *}</pre>
- */
- transient Set<K> keySet;
- transient Collection<V> values;
-
/**
* {@inheritDoc}
*
* @implSpec
- * This implementation returns a set that subclasses {@link AbstractSet}.
+ * This implementation returns a fresh set that subclasses {@link AbstractSet}.
* The subclass's iterator method returns a "wrapper object" over this
* map's {@code entrySet()} iterator. The {@code size} method
* delegates to this map's {@code size} method and the
* {@code contains} method delegates to this map's
* {@code containsKey} method.
*
- * <p>The set is created the first time this method is called,
- * and returned in response to all subsequent calls. No synchronization
- * is performed, so there is a slight chance that multiple calls to this
- * method will not all return the same set.
*/
public Set<K> keySet() {
/**
* {@inheritDoc}
*
* @implSpec
- * This implementation returns a collection that subclasses {@link
+ * This implementation returns a fresh collection that subclasses {@link
* AbstractCollection}. The subclass's iterator method returns a
* "wrapper object" over this map's {@code entrySet()} iterator.
* The {@code size} method delegates to this map's {@code size}
* method and the {@code contains} method delegates to this map's
* {@code containsValue} method.
*
- * <p>The collection is created the first time this method is called, and
- * returned in response to all subsequent calls. No synchronization is
- * performed, so there is a slight chance that multiple calls to this
- * method will not all return the same collection.
*/
public Collection<V> values() {
PR: https://github.com/openjdk/jdk/pull/15614/files
Discussion
Compatibility
AbstractMap
is designed to be subclassed. The caching fields are package private and as such they are not accessible from subclasses outside the JDK. Furthermore, the fields are transient
which means the serialized form is not affected by the proposed change.
Functionally, newly created objects are equivalent to the caching ones but callers that rely on ==
, indentyHashCode()
, synchronization, WeakReference
will be affected. However, it seems unreasonable for code to rely on such things about an object they don't have direct control over. The JDK has 517 direct usages of keySet()
and 445 direct usages of values()
; none of which is used for identity comparison. Most uses of the map views in the corpus seem to be transient uses that don't rely on the identity of the view objects. See below for examples and discussion.
The wrapper classes in the Collections
class are unaffected by the proposed change. The synchronized wrappers always synchronize on the wrapper itself and not on the backing collection, so they are also unaffected by the proposed change.
JDK subclasses EnumMap
, HashMap
, IdentityHashMap
, LinkedHashMap
, TreeMap
, ConsurrentHashMap
, ConcurrentSkipListMap
and WeakHashMap
also cached map views and in some cases reused the fields from AbstractMap
. However, the caching behavior was not specified. The view caches have been removed from these classes as well. Clients of these classes can observe similar behavior changes regarding the identity of the map views. For similar reasons, we do not believe this to be a major compatibility issue.
Performance
Initial performance benchmarks indicate 0 - 30% better performance and reduced memory pressure as cached views end up tenured, whereas freshly created view usually dies in young gen or allocation is elided entirely by escape analysis.
Overall, every Map
instance will be made smaller by the removal of two fields; this typically saves 8 bytes. On top of this, many maps will have to construct actual values for the fields as soon as the Map
is used and these objects will occupy some additional 16 bytes (Set
) and 16 bytes (Collection
) (typical values). The percentage space reduction of immutable maps (which are common) is especially noteworthy.
Removing the caching of these fields introduces a moderate risk of incompatibilities as not only the class AbstractMap
is affected but also 60 other classes in the JDK that directly inherit from it and transitively more classes.
Usage
A corpus analysis of direct use of java.util.AbstractMap
indicates extensive use:
wc -l experiment.html
534562 experiment.html
As the file contains two lines per found entry, this means there are more than 250,000 usages of AbstractMap
including prominent libraries like Guava. However, looking at a larger number of sample uses, these fall into three main categories:
- Transient use
- Long-term use
- Use based on identity
Transient use appears to be by far the most common use and here we see patterns like iteration, size, creating an array, and copying into another collection. Long-term use seems uncommon, and we have not been able to identify a single use based on identity in the samples. This does not rule out that there are, indeed, such use cases; but if they do occur, they would seem to be quite rare.
Here is some sample usage:
jo.keySet().toArray(new String[jo.length()])
map.keySet().iterator()
new ArrayList<>(nameValuePairs.keySet())
set.equals(((JSONObject)other).keySet())
public Collection<String> getFieldNames() {
return _fields.keySet();
}
for (K key : m.keySet()) {
// do stuff
}
public String toString() {
return "<State:" + stateType + ":" + table.keySet() + ">";
}
createLinkedHashSet(parameterKeys.values())
- csr of
-
JDK-8316493 Remove the caching fields in AbstractMap
-
- In Progress
-