-
Enhancement
-
Resolution: Fixed
-
P4
-
14
-
b07
Running G1 with JDK-8213108 on huge heaps shows unusually long unaccounted Merge Heap Roots times.
E.g.
[3322.457s][info ][gc,phases] GC(73) Merge Heap Roots: 15.4ms
[3322.457s][debug][gc,phases] GC(73) Remembered Sets (ms): Min: 0.1, Avg: 0.3, Max: 0.4, Diff: 0.3, Sum: 8.5, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Merged Sparse: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Merged Fine: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Merged Coarse: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Hot Card Cache (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Log Buffers (ms): Min: 2.2, Avg: 2.3, Max: 2.4, Diff: 0.2, Sum: 72.9, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Processed Buffers: Min: 18, Avg: 55.1, Max: 75, Diff: 57, Sum: 1762, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Dirty Cards: Min: 4608, Avg: 12754.0, Max: 16943, Diff: 12335, Sum: 408127, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Skipped Cards: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0, Workers: 32
The separate phases account for only a small part of total time taken. Investigate where the time is spent.
This is on a heap with 150GB @ 32M with 4800 regions.
Most likely the setup of some internal data structures (there is some 2.5MB of data to clear; probably it would be good to increase the G1RemSetScanState::CardsPerChunk constant for large heaps/region sizes; it would be doable because scanning indicates ~17k chunks total scanned) but also starting up/shutting down threads in the parallel part may be an issue (potentially too many threads started).
E.g.
[3322.457s][info ][gc,phases] GC(73) Merge Heap Roots: 15.4ms
[3322.457s][debug][gc,phases] GC(73) Remembered Sets (ms): Min: 0.1, Avg: 0.3, Max: 0.4, Diff: 0.3, Sum: 8.5, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Merged Sparse: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Merged Fine: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Merged Coarse: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Hot Card Cache (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Log Buffers (ms): Min: 2.2, Avg: 2.3, Max: 2.4, Diff: 0.2, Sum: 72.9, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Processed Buffers: Min: 18, Avg: 55.1, Max: 75, Diff: 57, Sum: 1762, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Dirty Cards: Min: 4608, Avg: 12754.0, Max: 16943, Diff: 12335, Sum: 408127, Workers: 32
[3322.457s][debug][gc,phases] GC(73) Skipped Cards: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0, Workers: 32
The separate phases account for only a small part of total time taken. Investigate where the time is spent.
This is on a heap with 150GB @ 32M with 4800 regions.
Most likely the setup of some internal data structures (there is some 2.5MB of data to clear; probably it would be good to increase the G1RemSetScanState::CardsPerChunk constant for large heaps/region sizes; it would be doable because scanning indicates ~17k chunks total scanned) but also starting up/shutting down threads in the parallel part may be an issue (potentially too many threads started).
- is blocked by
-
JDK-8227084 Add timing information for merge heap root preparation
- Resolved
-
JDK-8227089 Add timing information for merging humongous remembered sets
- Resolved
- relates to
-
JDK-8213108 Improve work distribution during remembered set scan
- Resolved