ADDITIONAL SYSTEM INFORMATION :
17.0.13
A DESCRIPTION OF THE PROBLEM :
We run a JVM with -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError.
The intention is to stop the application and generate a heap dump, immediately when the OOME is detected.
However, what we notice is that the process of dumping the heap in some cases takes insane amount of time. The reason is that during the heap dump, the GC pauses are still running are being triggered.
My suggestion is to stop the GC threads, when the combination of -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError startup parameters is detected. CrashOnOutOfMemoryError basically means that the user doesn't want the JVM to perform any action after OOME.
To see the issue in the logs, run the below program with -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError -Xmx3G -verbose:gc
public class OutOfMemoryErrorTest {
public static void main(String[] args) {
// trigger OOM
var list = new LinkedList<String>();
while (true) {
list.add("aaa");
if (list.size() % 1_000_000 == 0) {
System.out.println("List size: " + list.size());
}
}
}
}
You will see logs similar to this.
What is interesting is that between the first occurence of OutOfMemoryError and the file completing, JVM is logging a lot of GC pauses.
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid4489.hprof ...
[85.541s][info][gc] GC(111) Pause Young (Concurrent Start) (G1 Evacuation Pause) 3068M->3068M(3072M) 2.225ms
[85.541s][info][gc] GC(113) Concurrent Mark Cycle
[89.094s][info][gc] GC(112) Pause Full (G1 Compaction Pause) 3068M->3068M(3072M) 3552.943ms
[92.866s][info][gc] GC(114) Pause Full (G1 Compaction Pause) 3068M->3068M(3072M) 3771.683ms
[92.866s][info][gc] GC(113) Concurrent Mark Cycle 7324.984ms
[92.869s][info][gc] GC(115) Pause Young (Normal) (G1 Evacuation Pause) 3068M->3068M(3072M) 2.843ms
[97.040s][info][gc] GC(116) Pause Full (G1 Compaction Pause) 3068M->3068M(3072M) 4171.539ms
[101.603s][info][gc] GC(117) Pause Full (G1 Compaction Pause) 3068M->3068M(3072M) 4562.222ms
Heap dump file created [6543452335 bytes in 138.619 secs]
Aborting due to java.lang.OutOfMemoryError: Java heap space
I think this is because, in this code (debug.cpp), the system doesn't stop the GC on the first detection of OOME.
if (Atomic::cmpxchg(&out_of_memory_reported, 0, 1) == 0) {
// create heap dump before OnOutOfMemoryError commands are executed
if (HeapDumpOnOutOfMemoryError) {
tty->print_cr("java.lang.OutOfMemoryError: %s", message);
HeapDumper::dump_heap_from_oome();
}
if (OnOutOfMemoryError && OnOutOfMemoryError[0]) {
VMError::report_java_out_of_memory(message);
}
if (CrashOnOutOfMemoryError) {
tty->print_cr("Aborting due to java.lang.OutOfMemoryError: %s", message);
report_fatal(OOM_JAVA_HEAP_FATAL, __FILE__, __LINE__, "OutOfMemory encountered: %s", message);
}
This could be handled better by checking CrashOnOutOfMemoryError at the beginning of the branch, but before dump_heap_from_oome(), suspending whatever needs to be suspended there.
17.0.13
A DESCRIPTION OF THE PROBLEM :
We run a JVM with -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError.
The intention is to stop the application and generate a heap dump, immediately when the OOME is detected.
However, what we notice is that the process of dumping the heap in some cases takes insane amount of time. The reason is that during the heap dump, the GC pauses are still running are being triggered.
My suggestion is to stop the GC threads, when the combination of -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError startup parameters is detected. CrashOnOutOfMemoryError basically means that the user doesn't want the JVM to perform any action after OOME.
To see the issue in the logs, run the below program with -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError -Xmx3G -verbose:gc
public class OutOfMemoryErrorTest {
public static void main(String[] args) {
// trigger OOM
var list = new LinkedList<String>();
while (true) {
list.add("aaa");
if (list.size() % 1_000_000 == 0) {
System.out.println("List size: " + list.size());
}
}
}
}
You will see logs similar to this.
What is interesting is that between the first occurence of OutOfMemoryError and the file completing, JVM is logging a lot of GC pauses.
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid4489.hprof ...
[85.541s][info][gc] GC(111) Pause Young (Concurrent Start) (G1 Evacuation Pause) 3068M->3068M(3072M) 2.225ms
[85.541s][info][gc] GC(113) Concurrent Mark Cycle
[89.094s][info][gc] GC(112) Pause Full (G1 Compaction Pause) 3068M->3068M(3072M) 3552.943ms
[92.866s][info][gc] GC(114) Pause Full (G1 Compaction Pause) 3068M->3068M(3072M) 3771.683ms
[92.866s][info][gc] GC(113) Concurrent Mark Cycle 7324.984ms
[92.869s][info][gc] GC(115) Pause Young (Normal) (G1 Evacuation Pause) 3068M->3068M(3072M) 2.843ms
[97.040s][info][gc] GC(116) Pause Full (G1 Compaction Pause) 3068M->3068M(3072M) 4171.539ms
[101.603s][info][gc] GC(117) Pause Full (G1 Compaction Pause) 3068M->3068M(3072M) 4562.222ms
Heap dump file created [6543452335 bytes in 138.619 secs]
Aborting due to java.lang.OutOfMemoryError: Java heap space
I think this is because, in this code (debug.cpp), the system doesn't stop the GC on the first detection of OOME.
if (Atomic::cmpxchg(&out_of_memory_reported, 0, 1) == 0) {
// create heap dump before OnOutOfMemoryError commands are executed
if (HeapDumpOnOutOfMemoryError) {
tty->print_cr("java.lang.OutOfMemoryError: %s", message);
HeapDumper::dump_heap_from_oome();
}
if (OnOutOfMemoryError && OnOutOfMemoryError[0]) {
VMError::report_java_out_of_memory(message);
}
if (CrashOnOutOfMemoryError) {
tty->print_cr("Aborting due to java.lang.OutOfMemoryError: %s", message);
report_fatal(OOM_JAVA_HEAP_FATAL, __FILE__, __LINE__, "OutOfMemory encountered: %s", message);
}
This could be handled better by checking CrashOnOutOfMemoryError at the beginning of the branch, but before dump_heap_from_oome(), suspending whatever needs to be suspended there.
- links to
-
Review(master) openjdk/jdk/23519