- Bug
- Resolution: Delivered
- P2
- 21.0.2
Running a containerized server application using eclipse-temurin:21.0.1_12-jdk on Google Kubernetes Engine (16-core EPYC 7003 series, 64 GB RAM).
Flags:
-XX:+AlwaysPreTouch -XX:ConcGCThreads=3 -XX:FlightRecorderOptions=stackdepth=256 -XX:G1ConcRefinementThreads=13 -XX:GCDrainStackTargetSize=64 -XX:InitialCodeCacheSize=272629760 -XX:InitialHeapSize=53951594496 -XX:InitialRAMPercentage=80.000000 -XX:MarkStackSize=4194304 -XX:MaxHeapFreeRatio=100 -XX:MaxHeapSize=53951594496 -XX:MaxRAM=67439493120 -XX:MaxRAMPercentage=80.000000 -XX:MinHeapSize=6815736 -XX:NewRatio=1 -XX:ObjectAlignmentInBytes=16 -XX:+PrintCommandLineFlags -XX:+PrintFlagsFinal -XX:ReservedCodeCacheSize=838860800 -XX:+SegmentedCodeCache -XX:+UseCompressedOops -XX:-UseContainerSupport -XX:+UseG1GC -XX:+UseTransparentHugePages
A DESCRIPTION OF THE PROBLEM :
Within 10 minutes of application startup, we often encounter very long GC pauses in "Remark". These range from 400 ms to 4 seconds.
A Flight Recorder capture of one such pause shows that almost all of the pause is spent in "Class Unloading".
This long pause tends to be repeated a few times but eventually settles when our application has been running for a while.
As an experiment, I disabled class unloading with "-XX:-ClassUnloading". This completely eliminated the long Remark pauses, but we instead saw much longer young generation pauses (from around 160 ms to around 350 ms). The reason for that is not clear to us, but it was notable that "code root" scanning was very unbalanced.
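For anyone trying to confirm the same pattern from GC logs rather than a Flight Recorder capture, a minimal sketch follows. It assumes unified GC logging (for example -Xlog:gc) writes "Pause Remark" summary lines in the usual HotSpot format. The class name RemarkPauseScan, the record Pause, and the 100 ms threshold are hypothetical names chosen for illustration, and the log lines in the usage note are made up, not taken from our servers.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RemarkPauseScan {
    // Matches unified-logging summary lines such as:
    //   [612.345s][info][gc] GC(87) Pause Remark 30123M->29876M(51456M) 3921.417ms
    // Group 1 is the GC id, group 2 is the pause duration in milliseconds.
    private static final Pattern REMARK = Pattern.compile(
            "GC\\((\\d+)\\) Pause Remark .*? (\\d+(?:\\.\\d+)?)ms\\s*$");

    record Pause(int gcId, double millis) {}

    // Returns every Remark pause that lasted at least thresholdMillis.
    static List<Pause> longRemarks(List<String> lines, double thresholdMillis) {
        List<Pause> result = new ArrayList<>();
        for (String line : lines) {
            Matcher m = REMARK.matcher(line);
            if (m.find()) {
                double millis = Double.parseDouble(m.group(2));
                if (millis >= thresholdMillis) {
                    result.add(new Pause(Integer.parseInt(m.group(1)), millis));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java RemarkPauseScan.java <gc-log-file>");
            return;
        }
        // 100 ms is the Remark budget hoped for in this report; adjust as needed.
        for (Pause p : longRemarks(Files.readAllLines(Path.of(args[0])), 100.0)) {
            System.out.println("GC(" + p.gcId() + ") Remark took " + p.millis() + " ms");
        }
    }
}
```

Run against a GC log file, this lists only the Remark pauses at or above the threshold, which makes it easier to see whether they cluster in the first minutes after startup. It does not break a pause down into phases, so Flight Recorder (or more detailed logging) is still needed to attribute the time to class unloading.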
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
We run our server under moderate load and encounter the issue within 10 minutes. Unfortunately there is no small test case to reproduce the issue.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Class unloading should not take a few seconds. A Remark phase of less than 100 ms?
ACTUAL -
Remark pauses of 400 ms to 4 seconds, almost all of it spent in class unloading.
CUSTOMER SUBMITTED WORKAROUND :
-XX:-ClassUnloading does prevent the long pause in "remark" but seems to have a side effect of causing longer pause times in young generation collection ("evacuate collection set").
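When comparing runs with and without the workaround, it can help to confirm from inside the process which setting is actually in effect. The sketch below reads flag values through com.sun.management.HotSpotDiagnosticMXBean, which is available on HotSpot-based JDKs such as the Temurin image used here. The class name ClassUnloadingCheck and the helper vmOption are hypothetical names for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class ClassUnloadingCheck {
    // Returns the current value of a HotSpot VM option as a string.
    // getVMOption throws IllegalArgumentException if the option does not exist.
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // Both flags appear in this report; others can be added the same way.
        for (String flag : new String[] {"ClassUnloading", "UseG1GC"}) {
            System.out.println(flag + " = " + vmOption(flag));
        }
    }
}
```

With -XX:-ClassUnloading on the command line, the first line printed should read "ClassUnloading = false". The same information is also in the -XX:+PrintFlagsFinal output already enabled above.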
FREQUENCY : often
- relates to
- JDK-8288936 Wrong lock ordering writing G1HeapRegionTypeChange JFR event (Resolved)
- JDK-8318720 G1: Memory leak in G1CodeRootSet after JDK-8315503 (Resolved)
- JDK-8317600 VtableStubs::stub_containing() table load not ordered wrt to stores (Resolved)
- JDK-8317440 Lock rank checking fails when code root set is modified with the Servicelock held after JDK-8315503 (Closed)
- JDK-8318109 Writing JFR records while a CHT has taken its lock asserts in rank checking (Closed)
- JDK-8315503 G1: Code root scan causes long GC pauses due to imbalanced iteration (Resolved)
- JDK-8315605 G1: Add number of nmethods in code roots scanning statistics (Resolved)
- JDK-8315998 Remove dead ClassLoaderDataGraphKlassIteratorStatic (Resolved)
- JDK-8316002 Remove unnecessary seen_dead_loader in ClassLoaderDataGraph::do_unloading (Resolved)
- JDK-8316670 Remove effectively unused nmethodBucket::_count (Resolved)
- JDK-8316959 Improve InlineCacheBuffer pending queue management (Resolved)
- JDK-8317007 Add bulk removal of dead nmethods during class unloading (Resolved)
- JDK-8317235 Remove Access API use in nmethod class (Resolved)
- JDK-8317350 Move code cache purging out of CodeCache::UnloadingScope (Resolved)
- JDK-8317677 Specialize Vtablestubs::entry_for() for VtableBlob (Resolved)
- JDK-8317809 Insertion of free code blobs into code cache can be very slow during class unloading (Resolved)
- JDK-8318585 Rename CodeCache::UnloadingScope to UnlinkingScope (Resolved)
- JDK-8319955 Improve dependencies removal during class unloading (Resolved)
- JDK-8316669 ImmutableOopMapSet destructor not called (Resolved)