Type: Bug
Resolution: Fixed
Priority: P2
Affected Versions: 8, 11, 17, 21, 23, 24
Resolved In Build: b26
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8346658 | 21.0.7-oracle | Yagmur Eren | P2 | Resolved | Fixed | master |
FAILURE ANALYSIS
---
C2's BoxLock nodes are special in that they are not transformed into platform-specific Mach nodes [1], but kept as Ideal nodes all the way until code emission (see e.g. [2]). In this case, the crash is caused by the elimination in PhaseCFG::remove_empty_blocks() [3] of a basic block (B7 in before-empty-block-removal.pdf) that contains one BoxLock node (153 BoxLock) and no Mach nodes other than an unconditional branch. According to the current logic in Block::is_Empty(), such a block is empty because it does not contain any non-branch Mach node [4]. The removal of B7 causes a segmentation fault when the code emitted by a later node (9 cmpFastUnlock) attempts to load from the address that the (wrongly removed) BoxLock node should have computed in r1 ("box" in [5]).
A potential solution is to extend Block::is_Empty() so that it treats BoxLock and Mach nodes equally.
[1] https://github.com/openjdk/jdk/blob/f0b251d76078e8d5b47e967b0449c4cbdcb5a005/src/hotspot/share/opto/matcher.cpp#L2278
[2] https://github.com/openjdk/jdk/blob/f0b251d76078e8d5b47e967b0449c4cbdcb5a005/src/hotspot/cpu/aarch64/aarch64.ad#L2168-L2195
[3] https://github.com/openjdk/jdk/blob/f0b251d76078e8d5b47e967b0449c4cbdcb5a005/src/hotspot/share/opto/block.cpp#L735-L783
[4] https://github.com/openjdk/jdk/blob/f0b251d76078e8d5b47e967b0449c4cbdcb5a005/src/hotspot/share/opto/block.cpp#L184-L189
[5] https://github.com/openjdk/jdk/blob/ac82a8f89c7066fb1d379b12bcfd68053cb39ba4/src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp#L261
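The emptiness rule described above can be illustrated with a small toy model. This is only a sketch: the `Kind` enum and both methods are hypothetical stand-ins invented for illustration and are not HotSpot code. The real check lives in C++ in block.cpp [4]. The model assumes a block is just a list of node kinds and compares the rule as described (only non-branch Mach nodes count) with the proposed one (BoxLock nodes count as well).

```java
import java.util.List;

// Toy model only: these names are hypothetical and do not exist in HotSpot.
final class EmptyBlockSketch {
    // Simplified node kinds: a regular Mach node, a Mach branch, and an Ideal BoxLock.
    enum Kind { MACH, MACH_BRANCH, BOX_LOCK }

    // Rule as described in the analysis: a block is empty unless it
    // contains at least one non-branch Mach node.
    static boolean isEmptyBeforeFix(List<Kind> block) {
        return block.stream().noneMatch(k -> k == Kind.MACH);
    }

    // Proposed rule: BoxLock nodes are treated like non-branch Mach nodes,
    // so a block that holds one is no longer considered empty.
    static boolean isEmptyAfterFix(List<Kind> block) {
        return block.stream().noneMatch(k -> k == Kind.MACH || k == Kind.BOX_LOCK);
    }

    public static void main(String[] args) {
        // Shape of block B7: one BoxLock node plus an unconditional branch.
        List<Kind> b7 = List.of(Kind.BOX_LOCK, Kind.MACH_BRANCH);
        System.out.println("before fix: empty = " + isEmptyBeforeFix(b7));
        System.out.println("after fix:  empty = " + isEmptyAfterFix(b7));
    }
}
```

Under the first rule the B7-shaped block is reported as empty and would be a candidate for removal. Under the second rule it is kept, so the box address is still computed before the unlock code uses it.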
AFFECTED VERSIONS, PLATFORMS, AND CONFIGURATIONS
---
I could reproduce the segmentation fault on JDK 24, JDK 23, JDK 21, JDK 17, and JDK 11 on aarch64 using different JVM flags and a partial backport of JDK-8292289; see the comment in the attached TestSynchronizeWithEmptyBlock.java for details. I failed to reproduce the segmentation fault on JDK 8, but code inspection of Block::is_Empty() [1] and of different FastUnlock implementations (e.g. [2]) reveals that this JDK version is also potentially affected.
Both x64 and aarch64 platforms are affected. I could reproduce the issue on JDK 24 x64 by tweaking C2's register allocation heuristics (making the wrongly removed BoxLock node not rematerializable and randomizing register assignment).
The issue affects the LM_LEGACY locking mode (LockingMode=1), which is the default configuration in JDK 8-21, and the new LM_LIGHTWEIGHT locking mode (LockingMode=2) if UseObjectMonitorTable is enabled [3]. Currently, UseObjectMonitorTable is disabled by default, but it will likely be enabled in a future release because it is required by the UseCompactObjectHeaders JVM configuration. The LM_MONITOR locking mode (LockingMode=0) is unaffected.
[1] https://github.com/openjdk/jdk8/blob/6a383433a9f4661a96a90b2a4c7b5b9a85720031/hotspot/src/share/vm/opto/block.cpp#L149-L183
[2] https://github.com/openjdk/jdk8/blob/6a383433a9f4661a96a90b2a4c7b5b9a85720031/hotspot/src/cpu/x86/vm/x86_64.ad#L2736
[3] https://github.com/openjdk/jdk/blob/b53ee053f7f7ffcf02ff47e1895ce7be4bc32486/src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp#L603
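The configuration matrix above can be summarized as a small predicate. This is a sketch under stated assumptions: the class and method names are hypothetical, and the numeric values (0 = LM_MONITOR, 1 = LM_LEGACY, 2 = LM_LIGHTWEIGHT) are taken from the text above. It encodes only the statement about which configurations are affected and does not query a running JVM.

```java
// Hypothetical helper for illustration; not part of the JDK.
final class LockingModeSketch {
    // Returns whether the given flag combination is affected according to the
    // analysis: LM_MONITOR never, LM_LEGACY always, LM_LIGHTWEIGHT only when
    // UseObjectMonitorTable is enabled.
    static boolean isAffected(int lockingMode, boolean useObjectMonitorTable) {
        switch (lockingMode) {
            case 0: return false;                 // LM_MONITOR
            case 1: return true;                  // LM_LEGACY
            case 2: return useObjectMonitorTable; // LM_LIGHTWEIGHT
            default:
                throw new IllegalArgumentException("unknown LockingMode: " + lockingMode);
        }
    }

    public static void main(String[] args) {
        System.out.println("LockingMode=0: " + isAffected(0, false));
        System.out.println("LockingMode=1: " + isAffected(1, false));
        System.out.println("LockingMode=2, table off: " + isAffected(2, false));
        System.out.println("LockingMode=2, table on:  " + isAffected(2, true));
    }
}
```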
ORIGINAL REPORT
---
ADDITIONAL SYSTEM INFORMATION :
$ uname -a
Linux localhost.localdomain 4.19.90-2112.8.0.0131.oe1.aarch64 #1 SMP Fri Dec 31 19:53:20 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
$ cat /etc/os-release
NAME="openEuler"
VERSION="20.03 (LTS-SP3)"
ID="openEuler"
VERSION_ID="20.03"
PRETTY_NAME="openEuler 20.03 (LTS-SP3)"
ANSI_COLOR="0;31"
A DESCRIPTION OF THE PROBLEM :
When I run the following test case with jdk-17.0.11, the JVM crashes. This is strange because the problem does not occur on JDK 8. To determine whether it is a JIT problem, I tried running the test with -Xint and with -Xcomp, but the crash does not reproduce with either option.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
path/to/jdk-17.0.11/bin/javac -cp . Test.java
path/to/jdk-17.0.11/bin/java Test
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Continues running in an infinite loop
ACTUAL -
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000fffc692c0e60, pid=3747800, tid=3747801
#
# JRE version: Java(TM) SE Runtime Environment (17.0.11+7) (build 17.0.11+7-LTS-207)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0.11+7-LTS-207, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-aarch64)
# Problematic frame:
# J 9 c2 Test.t()V (118 bytes) @ 0x0000fffc692c0e60 [0x0000fffc692c0c40+0x0000000000000220]
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h" (or dumping to /home/TEMP/proj/core.3747800)
#
# An error report file with more information is saved as:
# /home/TEMP/proj/hs_err_pid3747800.log
#
# If you would like to submit a bug report, please visit:
# https://bugreport.java.com/bugreport/crash.jsp
#
---------- BEGIN SOURCE ----------
class Test {
    public static void main(String[] args) {
        for (;;)
            t();
    }
    static int var8;
    static void t() {
        synchronized (Test.class) {
            int var12 = 0;
            while (var12 < 10000) {
                var12++;
                if (var12 < 5)
                    synchronized (new Test()) {
                    }
            }
        }
        synchronized (Test.class) {
            int var4 = 0;
            do {
                var4++;
                if (var4 < 4) {
                    boolean var10 = true;
                    int var9 = 0;
                    do {
                        var9++;
                        if (var10)
                            var8++;
                    } while (var9 < 20000);
                }
            } while (var4 < 10000);
        }
    }
}
---------- END SOURCE ----------
FREQUENCY : always
Backported by:
- JDK-8346658 C2: basic blocks with only BoxLock nodes are wrongly treated as empty (Resolved)
Relates to:
- JDK-8345042 C2: simplify modeling of fast locking/unlocking (Open)
Links to:
- Commit(master) openjdk/jdk/01052035
- Review(master) openjdk/jdk/22038