-
Bug
-
Resolution: Fixed
-
P4
-
9
-
b115
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8183633 | 8u161 | Mikael Gerdin | P4 | Resolved | Fixed | b01 |
JDK-8165820 | 8u152 | Shafi Ahmad | P4 | Resolved | Fixed | b01 |
JDK-8192199 | emb-8u161 | Unassigned | P4 | Resolved | Fixed | b01 |
JDK-8165821 | 7u131 | Shafi Ahmad | P4 | Resolved | Fixed | b01 |
Symptom: sporadic long marking phases (up to n times longer, where n is the number of HW threads) with parallel old GC, if the object graph has many linear subgraphs.
Analysis: ParallelTaskTerminator::offer_termination() does not discover objects to scan on the marking stacks of co-worker threads, because it looks at the wrong TaskQueueSet. So instead of participating in the work, a gc thread that reaches offer_termination() will do SpinPause until the marking has been finished - potentially by just one co-worker thread.
Fix: let ParallelTaskTerminator check ParCompactionManager::stack_array() instead of ParCompactionManager::region_array().
A specialized microbenchmark, TestStealMarkingTask (source attached and results below), shows about 3x shorter parallel old full GCs with the patch.
Benchmark run without patch
---------------------------
$ jdk9/bin/java -Xms2g -Xmx2g -XX:+UseParallelGC -cp /usr/work/d038402/jtests/QuickJavaTests/bin test.gc.TestStealMarkingTask 10 10000 10000 2
Building list of length 10000 holding sub-lists of length 10000
Heap stats: free:1921 MB (97.9 %) max:1963 MB
done
Heap stats: free:808 MB (41.2 %) max:1963 MB
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3359 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3087 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3024 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3535 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 2831 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3313 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 4919 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3624 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 2689 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3418 ms
Benchmark run without patch: about 3x shorter full GC pauses
------------------------------------------------------------
$ jdk9_patched/bin/java -Xms2g -Xmx2g -XX:+UseParallelGC -cp /usr/work/d038402/jtests/QuickJavaTests/bin test.gc.TestStealMarkingTask 10 10000 10000 2
Building list of length 10000 holding sub-lists of length 10000
Heap stats: free:1921 MB (97.9 %) max:1963 MB
done
Heap stats: free:808 MB (41.2 %) max:1963 MB
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1536 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1255 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1351 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1302 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1281 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1259 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1257 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1292 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1305 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1282 ms
Analysis: ParallelTaskTerminator::offer_termination() does not discover objects to scan on the marking stacks of co-worker threads, because it looks at the wrong TaskQueueSet. So instead of participating in the work, a gc thread that reaches offer_termination() will do SpinPause until the marking has been finished - potentially by just one co-worker thread.
Fix: let ParallelTaskTerminator check ParCompactionManager::stack_array() instead of ParCompactionManager::region_array().
A specialized microbenchmark, TestStealMarkingTask (source attached and results below), shows about 3x shorter parallel old full GCs with the patch.
Benchmark run without patch
---------------------------
$ jdk9/bin/java -Xms2g -Xmx2g -XX:+UseParallelGC -cp /usr/work/d038402/jtests/QuickJavaTests/bin test.gc.TestStealMarkingTask 10 10000 10000 2
Building list of length 10000 holding sub-lists of length 10000
Heap stats: free:1921 MB (97.9 %) max:1963 MB
done
Heap stats: free:808 MB (41.2 %) max:1963 MB
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3359 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3087 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3024 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3535 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 2831 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3313 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 4919 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3624 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 2689 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 3418 ms
Benchmark run without patch: about 3x shorter full GC pauses
------------------------------------------------------------
$ jdk9_patched/bin/java -Xms2g -Xmx2g -XX:+UseParallelGC -cp /usr/work/d038402/jtests/QuickJavaTests/bin test.gc.TestStealMarkingTask 10 10000 10000 2
Building list of length 10000 holding sub-lists of length 10000
Heap stats: free:1921 MB (97.9 %) max:1963 MB
done
Heap stats: free:808 MB (41.2 %) max:1963 MB
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1536 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1255 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1351 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1302 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1281 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1259 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1257 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1292 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1305 ms
Calling System.gc()
Heap stats: free:817 MB (41.6 %) max:1963 MB
Full GC duration: 1282 ms
- backported by
-
JDK-8165820 Long pause in ParOldGC, because ParallelTaskTerminator peeks wrong TaskQueueSet
-
- Resolved
-
-
JDK-8165821 Long pause in ParOldGC, because ParallelTaskTerminator peeks wrong TaskQueueSet
-
- Resolved
-
-
JDK-8183633 Long pause in ParOldGC, because ParallelTaskTerminator peeks wrong TaskQueueSet
-
- Resolved
-
-
JDK-8192199 Long pause in ParOldGC, because ParallelTaskTerminator peeks wrong TaskQueueSet
-
- Resolved
-