Type: Bug
Resolution: Fixed
Priority: P3
Fix Version/s: 24
Affects Version/s: 17, 21, 24
Component/s: hotspot
Labels:
- amazon-interest
- gc-shenandoah

Subcomponent: gc
Resolved In Build: b06

Shenandoah is designed to be "pause-less", and GC is supposed to run concurrently with mutator threads. If properly configured for a particular workload, Shenandoah will not experience OOM. When it does experience OOM, that suggests that the JVM has not been properly configured for this workload and concurrent operation of GC is not feasible.

As currently implemented, Shenandoah sometimes spends far too long retrying failed allocation requests, during which mutator threads are blocked waiting for multiple urgent stop-the-world (degenerated or full) GC cycles to complete. In these situations, failing fast would probably be preferable, so that the workload can be redistributed to other services and this particular JVM can be restarted in a better state.

Another problem that has been observed is that Shenandoah, as currently implemented, sometimes throws OOM when it would not be appropriate. The following failure has been observed with the gc/shenandoah/oom/TestThreadFailure.java jtreg test:

1. For NastyThread-0 through NastyThread-11, we perform a Full GC (which has good progress), but the memory reclaimed is not enough to satisfy the failed allocation request, so we throw OOM.

2. With NastyThread-12, we do not fail fast. GC(127) is concurrent young. GC(128) through GC(132) are Full GCs, each with bad progress, but each yielding enough free memory to satisfy at least one additional allocation by NastyThread-12.

3. GC(133) is also a Full GC with bad progress. This time, the memory reclaimed is not enough to satisfy the pending allocation request (for 4112 bytes), so we throw OOM.

4. At this point, we have experienced 5 (the default value of ShenandoahNoProgressThreshold) consecutive Full GCs with no progress, so when the main thread attempts to allocate NastyThread-13 after joining with NastyThread-12, it does not even attempt a Full GC. It immediately throws OOM.

5. This causes the test to fail, because main is not "supposed" to experience OOM.
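
The fail-fast decision in steps 2 through 4 can be sketched as a small toy model. This is not the HotSpot implementation: the class name `NoProgressModel`, its methods, and the `>=` comparison against the threshold are assumptions made for illustration. Only the threshold value of 5 and the idea of counting consecutive no-progress Full GCs come from the description above.

```java
/** Toy model of the fail-fast behavior in steps 2 to 4; hypothetical, not HotSpot code. */
final class NoProgressModel {
    /** Stand-in for the default of ShenandoahNoProgressThreshold quoted in step 4. */
    static final int NO_PROGRESS_THRESHOLD = 5;

    /**
     * Replays a sequence of Full GC outcomes (true = good progress) and
     * returns the length of the no-progress streak at the end of the sequence.
     */
    static int consecutiveNoProgress(boolean[] fullGcHadGoodProgress) {
        int count = 0;
        for (boolean good : fullGcHadGoodProgress) {
            // A Full GC with good progress resets the streak; a bad one extends it.
            count = good ? 0 : count + 1;
        }
        return count;
    }

    /**
     * True if, after the given history, a failed allocation would skip the
     * Full GC and throw OOM immediately. The >= comparison is an assumption.
     */
    static boolean failsFast(boolean[] fullGcHadGoodProgress) {
        return consecutiveNoProgress(fullGcHadGoodProgress) >= NO_PROGRESS_THRESHOLD;
    }
}
```

Under this model, a history of six bad-progress Full GCs, such as GC(128) through GC(133), leaves `failsFast` true. That matches step 4: the streak left behind by NastyThread-12 causes the next allocation failure, on the main thread, to throw OOM without a Full GC attempt, even though main did not cause the streak.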

Exception in thread thread_name: java.lang.OutOfMemoryError: GC overhead limit exceeded
Cause: The detail message "GC overhead limit exceeded" indicates that the garbage collector is running all the time and the Java program is making very slow progress. After a garbage collection, if the Java process is spending more than approximately 98% of its time doing garbage collection, is recovering less than 2% of the heap, and has been doing so for the last 5 (compile-time constant) consecutive garbage collections, then a java.lang.OutOfMemoryError is thrown. This exception is typically thrown because the amount of live data barely fits into the Java heap, leaving little free space for new allocations.
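
The rule quoted above can be read as a check over the most recent collections. The following is a minimal sketch of that reading only, using the three numbers from the quoted text (98%, 2%, 5). The class `GcOverheadLimitModel`, its method, and the per-collection input arrays are hypothetical and do not correspond to any JVM API.

```java
/** Toy reading of the quoted "GC overhead limit exceeded" rule; hypothetical names throughout. */
final class GcOverheadLimitModel {
    static final double GC_TIME_LIMIT = 0.98;      // more than ~98% of time spent in GC
    static final double HEAP_RECOVERED_LIMIT = 0.02; // less than 2% of the heap recovered
    static final int CONSECUTIVE_LIMIT = 5;        // the "last 5" consecutive collections

    /**
     * Each index describes one collection: the fraction of time spent in GC
     * and the fraction of the heap recovered. Returns true once both
     * conditions have held for CONSECUTIVE_LIMIT collections in a row.
     */
    static boolean limitExceeded(double[] gcTimeFraction, double[] heapRecoveredFraction) {
        if (gcTimeFraction.length != heapRecoveredFraction.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int streak = 0;
        for (int i = 0; i < gcTimeFraction.length; i++) {
            boolean overLimit = gcTimeFraction[i] > GC_TIME_LIMIT
                    && heapRecoveredFraction[i] < HEAP_RECOVERED_LIMIT;
            // Any collection that misses either condition resets the streak.
            streak = overLimit ? streak + 1 : 0;
            if (streak >= CONSECUTIVE_LIMIT) {
                return true;
            }
        }
        return false;
    }
}
```

Both conditions must hold together for every collection in the streak, so a single collection that recovers 2% or more of the heap restarts the count.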

duplicates

JDK-8231397 [Redo] Shenandoah: GC retries are too aggressive for tests that expect OOME

Closed

links to

Commit openjdk/jdk/3a87eb5c

Commit(master) openjdk/shenandoah-jdk21u/2cc704b1

Review openjdk/jdk/19912

Review(master) openjdk/shenandoah-jdk21u/69

Assignee: Kelvin Nilsen

Reporter: Kelvin Nilsen

Votes: 0

Watchers: 4

Created: 2024-06-25 17:18

Updated: 2024-08-01 13:24

Resolved: 2024-07-08 11:06
