Type: Bug
Resolution: Fixed
Priority: P3
Affects Version(s): 17, 21, 25
Resolved In Build: b15
OS: linux
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8352387 | 24.0.2 | Thomas Stuefe | P3 | Resolved | Fixed | b01 |
JDK-8353102 | 21.0.8 | Thomas Stuefe | P3 | Resolved | Fixed | master |
One of our customers found that NUMA migrations (more precisely, the OS task getting scheduled to a different NUMA node) can cause G1 to crash if they happen at exactly the wrong moment.
The JVM runs with +UseNUMA and +UseNUMAInterleaving, G1 GC, and a 4 TB heap on machines with two or four NUMA nodes, with about 5000 application threads and 159 GC worker threads. The JVM crashes rarely, about once every four hours.
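For reference, a launcher command line matching that setup might look like the following sketch (the heap-size spelling and the assumption that the worker count was set via -XX:ParallelGCThreads are mine, not from the original report):
```
java -XX:+UseG1GC -XX:+UseNUMA -XX:+UseNUMAInterleaving \
     -Xmx4t -XX:ParallelGCThreads=159 ...
```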
The call stacks differ wildly, e.g.:
```
Stack: [0x00007e506733f000,0x00007e5067540000], sp=0x00007e506753cf10, free space=2039k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xf32422] Symbol::as_klass_external_name() const+0x12 (symbol.hpp:140)
V [libjvm.so+0xda71ff] SharedRuntime::generate_class_cast_message(Klass*, Klass*, Symbol*)+0x1f (sharedRuntime.cpp:2179)
V [libjvm.so+0xda99c4] SharedRuntime::generate_class_cast_message(JavaThread*, Klass*)+0xd4 (sharedRuntime.cpp:2171)
V [libjvm.so+0x578e2c] Runtime1::throw_class_cast_exception(JavaThread*, oopDesc*)+0x13c (c1_Runtime1.cpp:735)
```
In some crashes, it looks like we load a zero from the heap where no zero should be (e.g., as a narrow Klass ID from an oop header).
However, if you run a debug JVM, you usually see an assert either in G1Allocator or in CollectedHeap, for example:
```
Current thread (0x00007fb770087b70): JavaThread "Thread-33" [_thread_in_vm, id=123345, stack(0x00007fb7a86d7000,0x00007fb7a87d8000) (1028K)]

Stack: [0x00007fb7a86d7000,0x00007fb7a87d8000], sp=0x00007fb7a87d62f0, free space=1020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x9fdd6b] CollectedHeap::fill_with_object_impl(HeapWordImpl**, unsigned long, bool) [clone .part.0]+0x2b (collectedHeap.cpp:470)
V [libjvm.so+0x9fff1d] CollectedHeap::fill_with_object(HeapWordImpl**, unsigned long, bool)+0x39d (arrayOop.hpp:58)
V [libjvm.so+0xc5009f] G1AllocRegion::fill_up_remaining_space(HeapRegion*)+0x1ef (g1AllocRegion.cpp:79)
V [libjvm.so+0xc5027c] G1AllocRegion::retire_internal(HeapRegion*, bool)+0x6c (g1AllocRegion.cpp:106)
V [libjvm.so+0xc51347] MutatorAllocRegion::retire(bool)+0xb7 (g1AllocRegion.cpp:300)
V [libjvm.so+0xc50ed9] G1AllocRegion::new_alloc_region_and_allocate(unsigned long, bool)+0x59 (g1AllocRegion.cpp:139)
V [libjvm.so+0xc9b140] G1CollectedHeap::attempt_allocation_slow(unsigned long)+0x6d0 (g1AllocRegion.inline.hpp:120)
V [libjvm.so+0xc9e4ff] G1CollectedHeap::attempt_allocation(unsigned long, unsigned long, unsigned long*)+0x39f (g1CollectedHeap.cpp:643)
V [libjvm.so+0xc9bd4f] G1CollectedHeap::mem_allocate(unsigned long, bool*)+0x5f (g1CollectedHeap.cpp:401)
V [libjvm.so+0x13b9b6d] MemAllocator::mem_allocate_slow(MemAllocator::Allocation&) const+0x5d (memAllocator.cpp:240)
V [libjvm.so+0x13b9ca1] MemAllocator::allocate() const+0xa1 (memAllocator.cpp:357)
```
The problem is in `G1Allocator`. `G1AllocRegion` objects are tied to NUMA nodes. For most actions involving the `G1Allocator`, we determine the `G1AllocRegion` for the current thread's NUMA node, then direct the action to that alloc region. However, due to OS scheduling, the thread-to-NUMA-node association can change arbitrarily. That means successive calls into `G1Allocator` are not guaranteed to hit the same `G1AllocRegion` object.
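A minimal standalone sketch of that hazard, with hypothetical names (the real per-call node query in HotSpot goes through the OS, and the real state lives in `G1Allocator`):
```
#include <cstdio>
#include <vector>

// Stand-in for the per-call OS query "which NUMA node does the calling
// thread run on?". The scheduler may migrate the thread at any time;
// here, a migration between two consecutive calls is simulated.
static unsigned current_node_index() {
  static unsigned calls = 0;
  return (calls++ == 0) ? 0 : 1; // first call: node 0; afterwards: node 1
}

struct G1AllocRegionSketch { /* per-node allocation state */ };

int main() {
  // One alloc region per NUMA node, as described above.
  std::vector<G1AllocRegionSketch> mutator_alloc_regions(8);

  // Two related calls each resolve the "current" region independently...
  G1AllocRegionSketch* first  = &mutator_alloc_regions[current_node_index()];
  G1AllocRegionSketch* second = &mutator_alloc_regions[current_node_index()];

  // ...and land on different objects after a migration.
  std::printf("same G1AllocRegion: %s\n", (first == second) ? "yes" : "no");
  return 0;
}
```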
Now, we have control flows that assume they work with the same `G1AllocRegion` object for their whole duration, since we build up state in the `G1AllocRegion`. The affected control flow in JDK 21 is:
```
- `G1CollectedHeap::attempt_allocation_slow`
- `G1Allocator::attempt_allocation_locked` (A)
- `G1AllocRegion::attempt_allocation_locked`
- `G1AllocRegion::attempt_allocation` (retry allocating from the HeapRegion under lock protection); failing that:
- `G1AllocRegion::attempt_allocation_using_new_region`
- `G1AllocRegion::retire` (retires the current allocation region; may keep it as the retained region)
- `G1AllocRegion::new_alloc_region_and_allocate` (allocates a new HeapRegion and sets it; on failure, sets the dummy region); failing that:
- `G1Allocator::attempt_allocation_force` (B)
- `G1AllocRegion::attempt_allocation_force`
- `G1AllocRegion::new_alloc_region_and_allocate`
```
Here, if the thread changes its NUMA node between (A) and (B), the two calls address different `G1AllocRegion` objects. But `G1AllocRegion::attempt_allocation_force` assumes that the current allocation region of its object has already been retired by the preceding `G1AllocRegion::attempt_allocation_locked`; that retirement, however, happened on a different `G1AllocRegion` object.
This causes us to abandon the current allocation region; it won't be added to the collection set. On debug JVMs, we hit one of two asserts: either we complain at the entrance of `new_alloc_region_and_allocate` that the current allocation region is not the dummy region, or, in JDK 17, we assert when retiring the wrong region because it is emptier than expected. The effect can be delayed and only show up on the next retire, since it can affect the retained region.
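The following standalone sketch condenses the (A)/(B) mismatch described above into a few lines; all names are hypothetical stand-ins and the logic is heavily simplified compared to the real functions:
```
#include <cassert>

// Simulated thread-to-node association: flips after the first query,
// standing in for an OS task migration between (A) and (B).
static unsigned current_node() {
  static unsigned queries = 0;
  return (queries++ == 0) ? 0 : 1;
}

// One flag per node's alloc region: has its current region been retired?
static bool g_retired[2] = { false, false };

// (A) attempt_allocation_locked retires the current region of *its* object.
static void attempt_allocation_locked() {
  g_retired[current_node()] = true;
}

// (B) attempt_allocation_force assumes the region it sees was retired by
// the preceding (A); after a migration it inspects a different object,
// which is (conceptually) where the debug-JVM assert fires.
static void attempt_allocation_force() {
  assert(g_retired[current_node()] && "current alloc region not retired");
}

int main() {
  attempt_allocation_locked(); // runs while on node 0
  attempt_allocation_force();  // thread migrated: checks node 1 -> asserts
  return 0;
}
```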
----
Reproduction and regression testing
Reproducing the bug is difficult. I did not have a NUMA machine, and even if I had had one, NUMA task-node migrations are very rare. Therefore, I built something like a "FakeNUMA" mode that essentially interposes the OS NUMA calls and fakes a NUMA system with 8 nodes. I also added a "FakeNUMAStressMigrations" mode that mimics frequent node migrations. With these simple tools, I could reproduce the customer problem (with gc/TestJNICriticalStressTest, slightly modified to increase the number of JNICritical threads). I plan to bring the FakeNUMA mode upstream, but have no time at the moment to polish it up.
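The FakeNUMA patch itself is not part of this report; the sketch below merely illustrates the interposition idea under stated assumptions (a fixed 8-node fake topology, a node answer that rotates every 16 queries to mimic migrations, and entirely hypothetical names):
```
#include <atomic>
#include <cstdio>

namespace fake_numa {

constexpr unsigned kNumNodes = 8; // faked topology: 8 NUMA nodes

// Counts node queries; rotating the answer every 16 queries mimics
// frequent task-node migrations ("FakeNUMAStressMigrations").
std::atomic<unsigned> g_queries{0};

// Would replace the real "number of NUMA nodes" OS query.
unsigned num_nodes() { return kNumNodes; }

// Would replace the real "which node runs the current thread" OS query.
unsigned node_of_current_thread() {
  return (g_queries.fetch_add(1) / 16) % kNumNodes;
}

} // namespace fake_numa

int main() {
  // A thread polling its node sees it "migrate" every 16 queries.
  for (int i = 0; i < 48; i++) {
    std::printf("query %2d -> node %u\n", i, fake_numa::node_of_current_thread());
  }
  return 0;
}
```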
backported by:
- JDK-8352387 G1: NUMA migrations cause crashes in region allocation (Resolved)
- JDK-8353102 G1: NUMA migrations cause crashes in region allocation (Resolved)

duplicates:
- JDK-8350490 G1: Memory corruption during attempt_allocation_slow execution path (Closed)
- JDK-8351630 Fix NUMA association for the duration of a single G1 Heap allocation (Closed)

relates to:
- JDK-8351649 Parallel: NUMA migrations crash the VM (Closed)
- JDK-8351526 G1: UseNUMA may cause the JVM not to start if heap is too small (In Progress)
- JDK-8351770 Add a "FakeNUMA" mode to fake NUMA support and stress NUMA task migrations (Open)

links to:
- Commit(master) openjdk/jdk21u-dev/c5c0ac61
- Commit(master) openjdk/jdk24u/36765ad3
- Commit(master) openjdk/jdk/37ec7962
- Review(master) openjdk/jdk21u-dev/1488
- Review(master) openjdk/jdk21u/461
- Review(master) openjdk/jdk24u/138
- Review(master) openjdk/jdk/23984