Type: Enhancement
Resolution: Won't Fix
Priority: P3
Versions: 8, 9, 10
JDK version | LiveNodeCountInliningCutoff | RSS (MB) | Total runtime | Compilation time | Application time
==========================================================================================================
7u80        | 20'000 (default)            | 163      | 60s           | 11s              | 49s
8u60        | 20'000                      | 522      | 166s          | 127s             | 39s
8u60        | 40'000 (default)            | 976      | 414s          | 371s             | 43s
The measurements were taken on a Linux x86_64 machine with -Xbatch and a maximum heap size of 100 MB. The LiveNodeCountInliningCutoff column shows the value of the flag with the same name. As the numbers suggest, the VM's memory usage (RSS) increases by around 3.2X from 7u80 to 8u60 when both use a cutoff of 20'000. The most likely reason for the increase is that the Nashorn JavaScript engine is used by default in 8u60. The VM's memory usage increases further (by around 1.9X) when the 8u60 VM is executed with the default value for the LiveNodeCountInliningCutoff flag. The flag's value has been increased from 20'000 to 40'000 by
In total, the VM's memory usage for the application considered increases by 6X from 7u80 to 8u60. JDK9 is similar to JDK8 and is also affected by this problem.
A number of issues have targeted reducing the VM's memory usage (
The goal of this enhancement is to further reduce the memory usage of the compiler. This issue investigates the following ways of doing so:
(1) Change arrays directly addressed with node IDs (the _idx field of every compiler node) to use hash tables instead. This change should target arrays with a high impact on the compiler's memory usage.
(2) For compilations with a large number of nodes, introduce an additional chunk size (in addition to the existing sizes tiny, init, medium, size, non_pool_size). The new chunk size should be larger than the existing chunk sizes and should allow the reuse of large memory chunks that are currently allocated with the operating system's memory allocator.
(3) Incremental (or post-parse) inlining in C2 produces lots of dead nodes (observed on Octane/Nashorn). Multiple PhaseRenumberLive passes during incremental inlining can help further reduce peak memory usage in that scenario. Since the pass can be expensive, it can be triggered when the gap between unique and live node counts becomes too large and performed with PhaseIdealLoop (see Compile::inline_incrementally).
(4) PhaseRemoveUseless and PhaseIterGVN are performed too frequently (that problem is targeted by
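The trigger described in (3) can be sketched as follows. This is a standalone illustration, not HotSpot code: the names should_renumber, renumber_live, and max_dead_percent are hypothetical, and a real pass would also have to update every table keyed by node ID.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical trigger: renumber once dead IDs exceed max_dead_percent of
// all IDs handed out so far. max_dead_percent is an illustrative knob,
// not an existing VM flag.
inline bool should_renumber(uint32_t unique, uint32_t live,
                            uint32_t max_dead_percent) {
  assert(live <= unique);  // live nodes are a subset of all allocated IDs
  if (unique == 0) return false;
  uint64_t dead = static_cast<uint64_t>(unique) - live;
  return dead * 100 > static_cast<uint64_t>(unique) * max_dead_percent;
}

// Assigns dense new IDs to live nodes. is_live[i] says whether old ID i is
// still in use. Returns an old-to-new mapping; dead IDs map to UINT32_MAX.
inline std::vector<uint32_t> renumber_live(const std::vector<bool>& is_live) {
  std::vector<uint32_t> old_to_new(is_live.size(), UINT32_MAX);
  uint32_t next = 0;
  for (std::size_t i = 0; i < is_live.size(); ++i) {
    if (is_live[i]) old_to_new[i] = next++;
  }
  return old_to_new;
}
```

With a 50 percent threshold, 40'000 allocated IDs and 10'000 live nodes would trigger a renumbering, while 30'000 live nodes would not. After renumbering, arrays sized by the highest node ID only need one slot per live node.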
Here are some notes related to (1):
Code locations that use directly-referenced arrays:
- PhaseIdealLoop::Dominators -- allocates dfsorder and ntarjan arrays of size unique();
- PhaseIdealLoop::dom_depth and PhaseIdealLoop::_idom -- proportional to unique();
- PhaseCFG::global_code_motion -- recalc_pressure_nodes -- could be large, but size not necessarily proportional to unique();
- PhaseChaitin::stretch_base_pointer_live_ranges -- derived_base_map is allocated with malloc, size proportional to unique();
- PhaseIdealLoop::_preorders -- size proportional to unique();
- Compile::_node_bundling_base
- PhaseRegAlloc::_node_regs -- size proportional to unique();
- Scheduling::_node_bundling_base, _node_latency, _uses, _current_latency -- size most likely proportional to unique();
- Compile::fill_buffer -- allocates node_offsets array of size unique(), used only in fastdebug.
Data structures that use directly-referenced arrays:
- GrowableArray -- example usages: ConnectionGraph::nodes, DepGraph::_map, Compile::_node_note_array, LiveRangeMap::_names, LiveRangeMap::_uf_map, PhaseCFG::_node_latency
- Node_Array -- example usages: ConnectionGraph::_node_map, Matcher::_old2new_map (only debug), Matcher::_new2old_map (only debug), PhaseTransform::_nodes, Type_Array::_types
- Node_List -- example usages: Invariance::_old_new, PhaseCFG::schedule_local, Scheduling::_scheduled, Scheduling::_available
- Block_Array -- used in PhaseCFG::_node_to_block_mapping
- VectorSet -- uses _idx for checks -- already compressed, but could perhaps be optimized further.
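The trade-off behind (1) can be illustrated with a small standalone sketch. It is an assumption-laden illustration, not HotSpot code: DenseNodeMap and SparseNodeMap are hypothetical names, and std::unordered_map merely stands in for whatever hash table the compiler would actually use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Dense variant: one slot per node ID ever handed out, so memory is
// proportional to the highest ID (unique() in the notes above),
// even if most of those nodes are dead.
class DenseNodeMap {
 public:
  explicit DenseNodeMap(uint32_t unique) : _slots(unique, 0) {}
  void set(uint32_t idx, uint32_t value) {
    assert(idx < _slots.size());
    _slots[idx] = value;
  }
  uint32_t get(uint32_t idx) const {
    assert(idx < _slots.size());
    return _slots[idx];
  }
  std::size_t slot_count() const { return _slots.size(); }

 private:
  std::vector<uint32_t> _slots;
};

// Sparse variant: only IDs that were actually stored occupy an entry.
// Missing IDs read as 0, matching the dense variant's initial value.
class SparseNodeMap {
 public:
  void set(uint32_t idx, uint32_t value) { _slots[idx] = value; }
  uint32_t get(uint32_t idx) const {
    auto it = _slots.find(idx);
    return it == _slots.end() ? 0 : it->second;
  }
  std::size_t slot_count() const { return _slots.size(); }

 private:
  std::unordered_map<uint32_t, uint32_t> _slots;
};
```

A hash table carries per-entry overhead and slower lookups, so the sparse form only pays off where few IDs are populated relative to unique(). That is why the proposal targets only the arrays with a high impact on memory usage.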
Relates to:
- JDK-8014959 assert(Compile::current()->live_nodes() < (uint)MaxNodeLimit) failed: Live Node limit exceeded limit (Closed)
- JDK-8163999 Workaround intermittent failures of TreePosTest.java due to C2 memory usage (Closed)
- JDK-8059241 C2: Excessive RemoveUseless passes during incremental inlining (Resolved)
- JDK-8129847 Compiling methods generated by Nashorn triggers high memory usage in C2 (Resolved)
- JDK-8058148 MaxNodeLimit and LiveNodeCountInliningCutoff should be increased (Closed)
- JDK-8165193 Workaround intermittent failures of JavacTreeScannerTest and SourceTreeScannerTest due to C2 memory usage (Closed)
- JDK-8011858 Use Compile::live_nodes() instead of Compile::unique() in appropriate places (Resolved)
- JDK-8137160 Use Compile::live_nodes instead of Compile::unique() in appropriate places -- followup (Resolved)