- Bug
- Resolution: Unresolved
- P4
- 21, 23, 24
I have been debugging a weird crash in Shenandoah tests. The symptom is that we encounter a java.lang.Class object with a dead Klass*. I managed to narrow it down to the following sequence of events.
0. j.l.Class holds the reference to its associated InstanceKlass (IK) in one of its metadata fields. The IK knows about the associated j.l.Class via its java_mirror field.
1. The test encounters Thread.sleep. This forces classloading for the sleep event and for the jdk.internal.event.Event class.
2. Normal classloader runs and initializes InstanceKlass1 (IK1) and associated j.l.Class mirror (CM1). CM1 points to IK1. IK1 points to CM1.
3. The JFR transformer runs on the Event (JDK-8282420). This creates IK2 with CM2. The transformer installs IK2 into the newly created Event object and marks IK1 for deallocation as the "previous version of IK". Class unloading has not executed yet, so IK1 is still reachable from ClassLoaderData (CLD), and thus CM1 is reachable as well.
4. GC marking runs. CM1 is marked via CLD.
5. CM1 is selected for evacuation, and we evacuate it to CM1*. Roots get fixed, so IK1 knows about CM1* now.
6. GC class unloading runs. IK "previous versions" cleanup runs, and IK1 gets deallocated. When IK1 deallocates, it goes to its Java mirror and nulls out the mirror's IK reference, so that IK1 is no longer reachable through the Java mirror. In this setup, IK1 goes to CM1* (step 5) and sets its IK reference to nullptr, breaking the link between the two. At this point, CM1* points to nullptr. But CM1 still points to the "garbage" deallocated IK1!
7. GC heap references update runs. The Verifier runs prior to it. Since CM1 is marked (step 4), GC visits it. Since CM1 is a java.lang.Class, the reference updater visits its IK (I think for visiting class static fields, not 100% sure). Normally we would visit CM1*, which would have its IK as nullptr, which would then be skipped. But instead, we are visiting CM1, which points to IK1 (now garbage).
8. GC Verifier crashes.
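To make steps 2, 5 and 6 easier to follow, here is a toy model of the pointer graph. This is only a sketch, not HotSpot code: the names `Klass`, `Mirror`, `evacuate`, `deallocate`, `resolve` and `klass_is_safe` are hypothetical and exist only for illustration, and the `alive` flag stands in for "the metaspace memory is still allocated".

```python
class Klass:
    """Toy stand-in for an InstanceKlass (hypothetical, not the HotSpot type)."""
    def __init__(self, name):
        self.name = name
        self.mirror = None   # models the java_mirror field
        self.alive = True    # False once the IK has been deallocated


class Mirror:
    """Toy stand-in for a java.lang.Class mirror."""
    def __init__(self, name, klass):
        self.name = name
        self.klass = klass     # models the Klass* metadata field
        self.forwardee = None  # set when the object is evacuated


def evacuate(old):
    """Step 5: copy the mirror and leave a forwarding pointer in the old copy."""
    copy = Mirror(old.name + "*", old.klass)
    old.forwardee = copy
    return copy


def deallocate(klass):
    """Step 6: the IK nulls out the klass field of the mirror it knows about."""
    if klass.mirror is not None:
        klass.mirror.klass = None
    klass.alive = False


def resolve(mirror):
    """Follow the forwarding pointer, if any."""
    return mirror.forwardee if mirror.forwardee is not None else mirror


def klass_is_safe(mirror):
    """A mirror is safe to walk if its klass is null or still allocated."""
    return mirror.klass is None or mirror.klass.alive


def run_scenario():
    ik1 = Klass("IK1")
    cm1 = Mirror("CM1", ik1)
    ik1.mirror = cm1          # step 2: CM1 <-> IK1
    cm1_star = evacuate(cm1)  # step 5: CM1 evacuated to CM1*
    ik1.mirror = cm1_star     # step 5: roots fixed, IK1 now knows CM1*
    deallocate(ik1)           # step 6: only CM1* gets its klass nulled
    return cm1, cm1_star


if __name__ == "__main__":
    cm1, cm1_star = run_scenario()
    print("CM1* klass:", cm1_star.klass)
    print("CM1 klass alive:", cm1.klass.alive)
    print("safe via CM1:", klass_is_safe(cm1))
    print("safe via resolve(CM1):", klass_is_safe(resolve(cm1)))
```

In this model, walking the from-space copy CM1 directly reaches the dead IK1, while going through the forwarding pointer first lands on CM1*, whose klass field is already null.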
This can be reproduced more precisely with the Verifier after amending it with the attached patch, and running:
$ CONF=linux-x86_64-server-fastdebug make test TEST=gc/shenandoah/jni/TestJNIGlobalRefs.java JTREG=REPEAT_COUNT=10
This reproducer relies on the Verifier going into cset regions and encountering CM1 there. In the normal update references step itself, we do not touch cset regions, and the runtime is supposed to execute the LRB before accessing any object state. There is a suspicious place in the evacuation code itself, however, which _does_ process the cset. AFAICS, it follows a similar (bad) path to the one in step 7.
- relates to
  - JDK-8340381 Shenandoah: Class mirrors verification should check forwarded objects (Open)
  - JDK-8340297 Metaspace API for checking if address is in use (In Progress)
  - JDK-8282420 JFR: Remove event handlers (Resolved)
- links to
  - Review(master) openjdk/jdk/21054