Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8316583

Remove the caching fields in AbstractMap

XMLWordPrintable

    • Icon: CSR CSR
    • Resolution: Unresolved
    • Icon: P3 P3
    • 25
    • core-libs
    • None
    • behavioral
    • low
    • Hide
      Operations involving identity-sensitive operations on derived map objects may fail (not including immutable maps which are already specified @ValueBased).

      This includes operations like `==`, `indentyHashCode()`, synchronization and `WeakReference`.
      Show
      Operations involving identity-sensitive operations on derived map objects may fail (not including immutable maps which are already specified @ValueBased). This includes operations like `==`, `indentyHashCode()`, synchronization and `WeakReference`.
    • Java API
    • SE

      Summary

      By removing the caching fields in AbstractMap, we can make the immutable maps @ValueBased, smaller and, at the same time, improve performance. In fact, all Map implementations based on AbstractMap will be smaller.

      Problem

      As the current implementation of AbstractMap declares fields that cache the Map's keySet and values views. This prevents subclasses from being @ValueBased. Caching the views might have been a good idea around the turn of the millennium but these days, the JVM is oftentimes able to optimize away the creation of new objects that do not escape.

      Solution

      Instead of caching the views for potential reuse, new instances are created on each call to the keySet() and values() methods. This allows removal of cache fields:

      • transient Set<K> keySet;
      • transient Collection<V> values;

      According to the previous specs, the methods were not guaranteed to return the same cached objects due to concurrency issues, so it would theoretically be okay to just remove them. However, a single-threaded application would observe the view objects returned always being identical.

      Specification

      diff --git a/src/java.base/share/classes/java/util/AbstractMap.java b/src/java.base/share/classes/java/util/AbstractMap.java
      index c5ea53e17db8..52fb5af15889 100644
      --- a/src/java.base/share/classes/java/util/AbstractMap.java
      +++ b/src/java.base/share/classes/java/util/AbstractMap.java
      @@ -301,124 +301,48 @@ public void clear() {
                entrySet().clear();
           }
      
      -
      -    // Views
      -
      -    /**
      -     * Each of these fields are initialized to contain an instance of the
      -     * appropriate view the first time this view is requested.  The views are
      -     * stateless, so there's no reason to create more than one of each.
      -     *
      -     * <p>Since there is no synchronization performed while accessing these fields,
      -     * it is expected that java.util.Map view classes using these fields have
      -     * no non-final fields (or any fields at all except for outer-this). Adhering
      -     * to this rule would make the races on these fields benign.
      -     *
      -     * <p>It is also imperative that implementations read the field only once,
      -     * as in:
      -     *
      -     * <pre> {@code
      -     * public Set<K> keySet() {
      -     *   Set<K> ks = keySet;  // single racy read
      -     *   if (ks == null) {
      -     *     ks = new KeySet();
      -     *     keySet = ks;
      -     *   }
      -     *   return ks;
      -     * }
      -     *}</pre>
      -     */
      -    transient Set<K>        keySet;
      -    transient Collection<V> values;
      -
           /**
            * {@inheritDoc}
            *
            * @implSpec
       -     * This implementation returns a set that subclasses {@link AbstractSet}.
       +     * This implementation returns a fresh set that subclasses {@link AbstractSet}.
            * The subclass's iterator method returns a "wrapper object" over this
            * map's {@code entrySet()} iterator.  The {@code size} method
            * delegates to this map's {@code size} method and the
            * {@code contains} method delegates to this map's
            * {@code containsKey} method.
            *
      -     * <p>The set is created the first time this method is called,
      -     * and returned in response to all subsequent calls.  No synchronization
      -     * is performed, so there is a slight chance that multiple calls to this
      -     * method will not all return the same set.
            */
           public Set<K> keySet() {
      
      
          /**
            * {@inheritDoc}
            *
            * @implSpec
      -     * This implementation returns a collection that subclasses {@link
      +     * This implementation returns a fresh collection that subclasses {@link
            * AbstractCollection}.  The subclass's iterator method returns a
            * "wrapper object" over this map's {@code entrySet()} iterator.
            * The {@code size} method delegates to this map's {@code size}
            * method and the {@code contains} method delegates to this map's
            * {@code containsValue} method.
            *
      -     * <p>The collection is created the first time this method is called, and
      -     * returned in response to all subsequent calls.  No synchronization is
      -     * performed, so there is a slight chance that multiple calls to this
      -     * method will not all return the same collection.
            */
           public Collection<V> values() {

      PR: https://github.com/openjdk/jdk/pull/15614/files

      Discussion

      Compatibility

      AbstractMap is designed to be subclassed. The caching fields are package private and as such they are not accessible from subclasses outside the JDK. Furthermore, the fields are transient which means the serialized form is not affected by the proposed change.

      Functionally, newly created objects are equivalent to the caching ones but callers that rely on ==, indentyHashCode(), synchronization, WeakReference will be affected. However, it seems unreasonable for code to rely on such things about an object they don't have direct control over. The JDK has 517 direct usages of keySet() and 445 direct usages of values(); none of which is used for identity comparison. Most uses of the map views in the corpus seem to be transient uses that don't rely on the identity of the view objects. See below for examples and discussion.

      The wrapper classes in the Collections class are unaffected by the proposed change. The synchronized wrappers always synchronize on the wrapper itself and not on the backing collection, so they are also unaffected by the proposed change.

      JDK subclasses EnumMap, HashMap, IdentityHashMap, LinkedHashMap, TreeMap, ConsurrentHashMap, ConcurrentSkipListMap and WeakHashMap also cached map views and in some cases reused the fields from AbstractMap. However, the caching behavior was not specified. The view caches have been removed from these classes as well. Clients of these classes can observe similar behavior changes regarding the identity of the map views. For similar reasons, we do not believe this to be a major compatibility issue.

      Performance

      Initial performance benchmarks indicate 0 - 30% better performance and reduced memory pressure as cached views end up tenured, whereas freshly created view usually dies in young gen or allocation is elided entirely by escape analysis.

      Overall, every Map instance will be made smaller by the removal of two fields; this typically saves 8 bytes. On top of this, many maps will have to construct actual values for the fields as soon as the Map is used and these objects will occupy some additional 16 bytes (Set) and 16 bytes (Collection) (typical values). The percentage space reduction of immutable maps (which are common) is especially noteworthy.

      Removing the caching of these fields introduces a moderate risk of incompatibilities as not only the class AbstractMap is affected but also 60 other classes in the JDK that directly inherit from it and transitively more classes.

      Usage

      A corpus analysis of direct use of java.util.AbstractMap indicates extensive use:

      wc -l experiment.html 
          534562 experiment.html

      As the file contains two lines per found entry, this means there are more than 250,000 usages of AbstractMap including prominent libraries like Guava. However, looking at a larger number of sample uses, these fall into three main categories:

      • Transient use
      • Long-term use
      • Use based on identity

      Transient use appears to be by far the most common use and here we see patterns like iteration, size, creating an array, and copying into another collection. Long-term use seems uncommon, and we have not been able to identify a single use based on identity in the samples. This does not rule out that there are, indeed, such use cases; but if they do occur, they would seem to be quite rare.

      Here is some sample usage:

      jo.keySet().toArray(new String[jo.length()])
      
      map.keySet().iterator()
      
      new ArrayList<>(nameValuePairs.keySet())
      
      set.equals(((JSONObject)other).keySet())
      
      public Collection<String> getFieldNames() {
          return _fields.keySet();
      }
      
      for (K key : m.keySet()) {
        // do stuff
      }
      
      public String toString() {
          return "<State:" + stateType + ":" + table.keySet() + ">";
      }
      
      createLinkedHashSet(parameterKeys.values())

            pminborg Per-Ake Minborg
            pminborg Per-Ake Minborg
            Votes:
            1 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated: