-
Enhancement
-
Resolution: Unresolved
-
P4
-
26
In most cases, an explicit loop with single-element operations is substantially faster than bulk methods that process multiple elements. For example:
new HashMap<>(myMonomorphicMap)
is up to 57% slower than:
Map<K, V> myNewMap = new HashMap<>((int)(myMonomorphicMap.size() * 1.35));
for (Entry<K, V> entry : myMonomorphicMap.entrySet()) {
    myNewMap.put(entry.getKey(), entry.getValue());
}
Evidence:
JMH benchmarks on both aarch64 (AWS c6g) and x64 (AWS c6a) demonstrate significant performance differences:
• **aarch64**: Manual inlining shows 24-94% performance improvement over HashMap constructor
• **x64**: Manual inlining shows 65-136% performance improvement over HashMap constructor
• **Memory efficiency**: Manual approach uses 1-7% less memory per operation for non-empty maps
• **Edge case**: Empty maps show 10-23% regression with manual approach
Performance gains scale with map size, with the largest improvements occurring for maps with 100+ elements.
The attached JMH test HashMapConstructorBenchmark.java demonstrates the effect on HashMap.<init>(Map) and the JMH test CollectionPolymorphismBenchmark.java demonstrates the effect on a wider selection of similar methods.
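As a rough illustration of what the manual rewrite looks like in application code, including a guard for the empty-map regression noted above, here is a minimal sketch. The class SessionAttributes and its methods are hypothetical and not taken from the attached benchmarks, and the isEmpty() guard is an assumption that has not been measured. The loop is written in place rather than in a shared utility method, on the assumption that a shared helper would see many receiver types again and lose the benefit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application class; only the shape of snapshot() matters here.
final class SessionAttributes {
    private final HashMap<String, Object> attributes = new HashMap<>();

    void put(String key, Object value) {
        attributes.put(key, value);
    }

    // Copies the map with an explicit loop instead of new HashMap<>(attributes).
    HashMap<String, Object> snapshot() {
        // Assumption: skipping the loop for empty maps avoids the reported
        // empty-map regression. This guard has not been benchmarked.
        if (attributes.isEmpty()) {
            return new HashMap<>();
        }
        // Same presizing factor as the example in the description.
        HashMap<String, Object> copy =
                new HashMap<>((int) (attributes.size() * 1.35));
        for (Map.Entry<String, Object> entry : attributes.entrySet()) {
            copy.put(entry.getKey(), entry.getValue());
        }
        return copy;
    }

    public static void main(String[] args) {
        SessionAttributes session = new SessionAttributes();
        session.put("user", "alice");
        session.put("visits", 3);
        System.out.println(session.snapshot());
    }
}
```

Because the source map here is always a HashMap, the interface calls in the loop see a single receiver type at this call site, which is the situation the benchmarks compare against the shared constructor.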
Explanation:
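The call sites inside HashMap.<init>(Map) are inherently megamorphic: the constructor is shared by every caller and is handed many Map implementations (HashMap, TreeMap, LinkedHashMap, Collections.SingletonMap, etc.), so a caller that is locally monomorphic still suffers from global megamorphism. The JIT cannot eliminate the virtual method lookups in this unusually dense run of virtual calls: Map.entrySet(), Set.iterator(), Iterator.hasNext(), Iterator.next(), Entry.getKey(), Entry.getValue(). Put another way, iterating across an input Map of n entries requires (3 + 4n) virtual method calls: entrySet() and iterator() once each, hasNext() n + 1 times, and next(), getKey(), and getValue() n times each. List iteration is somewhat better with (2 + 2n): iterator() once, hasNext() n + 1 times, and next() n times.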
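The call counts above can be checked with a small sketch that wraps a Map and a List in counting decorators and runs the same loops. Everything in VirtualCallCount is hypothetical illustration code. It counts interface calls at the source level only and says nothing about what the JIT actually devirtualizes.

```java
import java.util.AbstractList;
import java.util.AbstractMap;
import java.util.AbstractSet;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: counts the interface calls made while iterating.
final class VirtualCallCount {
    static int calls; // calls observed since the last reset

    static <T> Iterator<T> countingIterator(Iterator<T> it) {
        return new Iterator<T>() {
            @Override public boolean hasNext() { calls++; return it.hasNext(); }
            @Override public T next() { calls++; return it.next(); }
        };
    }

    static <K, V> Map.Entry<K, V> countingEntry(K key, V value) {
        return new Map.Entry<K, V>() {
            @Override public K getKey() { calls++; return key; }
            @Override public V getValue() { calls++; return value; }
            @Override public V setValue(V v) { throw new UnsupportedOperationException(); }
        };
    }

    static <K, V> Map<K, V> countingMap(Map<K, V> src) {
        // Pre-wrap the entries so getKey()/getValue() are counted too.
        List<Map.Entry<K, V>> entries = new ArrayList<>();
        for (Map.Entry<K, V> e : src.entrySet()) {
            entries.add(countingEntry(e.getKey(), e.getValue()));
        }
        return new AbstractMap<K, V>() {
            @Override public Set<Map.Entry<K, V>> entrySet() {
                calls++; // Map.entrySet()
                return new AbstractSet<Map.Entry<K, V>>() {
                    @Override public int size() { return entries.size(); }
                    @Override public Iterator<Map.Entry<K, V>> iterator() {
                        calls++; // Set.iterator()
                        return countingIterator(entries.iterator());
                    }
                };
            }
        };
    }

    static <T> List<T> countingList(List<T> src) {
        return new AbstractList<T>() {
            @Override public T get(int i) { return src.get(i); }
            @Override public int size() { return src.size(); }
            @Override public Iterator<T> iterator() {
                calls++; // List.iterator()
                return countingIterator(src.iterator());
            }
        };
    }

    // Runs the copy loop from the description over n entries; returns the call count.
    static int mapCopyCalls(int n) {
        Map<Integer, Integer> src = new HashMap<>();
        for (int i = 0; i < n; i++) src.put(i, i);
        Map<Integer, Integer> counted = countingMap(src);
        Map<Integer, Integer> dst = new HashMap<>();
        calls = 0;
        for (Map.Entry<Integer, Integer> e : counted.entrySet()) {
            dst.put(e.getKey(), e.getValue());
        }
        return calls;
    }

    // Iterates a List of n elements; returns the call count.
    static int listIterationCalls(int n) {
        List<Integer> src = new ArrayList<>();
        for (int i = 0; i < n; i++) src.add(i);
        List<Integer> counted = countingList(src);
        long sum = 0;
        calls = 0;
        for (Integer x : counted) sum += x;
        return calls;
    }

    public static void main(String[] args) {
        int n = 10;
        System.out.println("map:  " + mapCopyCalls(n) + " vs 3 + 4n = " + (3 + 4 * n));
        System.out.println("list: " + listIterationCalls(n) + " vs 2 + 2n = " + (2 + 2 * n));
    }
}
```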
Impact:
The microbenchmarks show performance improvements of 24-136% for the manual approach depending on architecture and map size. Separate analysis of real-world application hotspots shows performance gains of 16-75% when these bulk methods are manually rewritten in application code.
Scope:
Profiling analysis of 8 known but unsolved application hotspots shows that 7 included these methods, including one in Tomcat (https://bz.apache.org/bugzilla/show_bug.cgi?id=69820) and one in Log4j (https://github.com/apache/logging-log4j2/issues/3935). Additional slow methods have been found in other custom data structures, such as those in Guava and proprietary libraries.
Solution:
No great idea, but here are some conversation starters:
1. Manually identify and rewrite all hotspots. A search of our profiling data for several applications shows that this problem is in our libraries more often than in our application code, so the surface area and cost of fixing are quite high.
2. Modify javac to rewrite calls to known methods, similar to how String concatenation is under-the-covers magic. This implicitly increases compiled code size, violates the spec, and requires all libraries to be recompiled and published.
3. Develop JIT techniques to create specialized versions of bulk methods for common receiver type combinations, reducing virtual call overhead.
relates to:
• JDK-8015416 tier one should collect context-dependent split profiles (Open)
• JDK-8015417 profile pollution after call through invokestatic to shared code (Open)