-
Bug
-
Resolution: Unresolved
-
P4
-
23
-
x86
-
linux
ADDITIONAL SYSTEM INFORMATION :
openjdk 23.0.2 2025-01-21
OpenJDK Runtime Environment (nix) (build 23.0.2+7)
OpenJDK 64-Bit Server VM (nix) (build 23.0.2+7, mixed mode, sharing)
A DESCRIPTION OF THE PROBLEM :
I was adding tests to our deterministic concurrency testing framework [fray](https://github.com/cmu-pasta/fray) and noticed a deadlock exception raised from the following code
```
public static void main(String[] args) throws InterruptedException {
test();
}
public static void test() {
ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
new Thread(() -> {
executor.shutdown();
}).start();
try {
ScheduledFuture<?> future = executor.schedule(() -> {
Thread.yield();
}, 10, TimeUnit.MILLISECONDS);
try {
future.get();
Thread.yield();
} catch (Throwable e) {
}
} catch (RejectedExecutionException e) {}
}
```
And this seems a race condition inside ScheduledThreadPoolExecutor when thread A calls shutdown and thread B schedules a new task:
Here is an interleaving that triggers the bug:
Thread A Thread B
enters `delayedExecute` but does not call `
super.getQueue().add(task);
enters shutdown/onShutdown/tryTerminate but does not call
`mainLock.lock();`
calls `super.getQueue().add(task);`
but not call `ensurePrestart();`
set `ctl` to TIDYING
evaluate the following statement and then
return:
if (runStateAtLeast(c, SHUTDOWN)
&& (runStateAtLeast(c, STOP)
|| firstTask != null
|| workQueue.isEmpty()))
return false;
set `ctl` to TERMINATED
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
You may also use Fray to run the test code directly and it will report deadlock exception. https://github.com/cmu-pasta/fray
- you need to download and build Fray (the test case code is already included in Fray)
- run fray with fray bin `./bin/fray -cp ./integration-test/build/libs/fray-integration-test-0.4.5-SNAPSHOT.jar org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal`
- To reproduce the failure deterministically you may run `./bin/fray --replay /tmp/report/recording -cp ./integration-test/build/libs/fray-integration-test-0.4.5-SNAPSHOT.jar org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal`
- Note that `./bin/fray` is just a bash wrapper of a java command, so you may add additional arguments such as `-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005` to enable debugging.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
FutureTask should be canceled
ACTUAL -
Main thread hangs:
```
2025-06-01 12:40:27
Full thread dump OpenJDK 64-Bit Server VM (23-internal-adhoc.aoli.jdk mixed mode):
Threads class SMR info:
_java_thread_list=0x00007fb7e0001f20, length=13, elements={
0x00007fb83402ae40, 0x00007fb8341dccb0, 0x00007fb8341de150, 0x00007fb8341df860,
0x00007fb8341e0dd0, 0x00007fb8341e2370, 0x00007fb8341e3e90, 0x00007fb8341e55a0,
0x00007fb834221360, 0x00007fb83437cd30, 0x00007fb83437e150, 0x00007fb7c0001200,
0x00007fb7e0000f50
}
"main" #1 [1724109] prio=5 os_prio=0 cpu=75.54ms elapsed=98.22s tid=0x00007fb83402ae40 nid=1724109 waiting on condition [0x00007fb8393fe000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base/Native Method)
- parking to wait for <0x00000006364f0440> (a java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
at java.util.concurrent.locks.LockSupport.park(java.base/LockSupport.java:221)
at java.util.concurrent.FutureTask.awaitDone(java.base/FutureTask.java:500)
at java.util.concurrent.FutureTask.get(java.base/FutureTask.java:190)
at org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal.test(ScheduledThreadPoolWorkSteal.java:21)
at org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal.main(ScheduledThreadPoolWorkSteal.java:8)
"Reference Handler" #9 [1724117] daemon prio=10 os_prio=0 cpu=0.26ms elapsed=98.20s tid=0x00007fb8341dccb0 nid=1724117 waiting on condition [0x00007fb818c19000]
java.lang.Thread.State: RUNNABLE
at java.lang.ref.Reference.waitForReferencePendingList(java.base/Native Method)
at java.lang.ref.Reference.processPendingReferences(java.base/Reference.java:246)
at java.lang.ref.Reference$ReferenceHandler.run(java.base/Reference.java:208)
"Finalizer" #10 [1724118] daemon prio=8 os_prio=0 cpu=0.10ms elapsed=98.20s tid=0x00007fb8341de150 nid=1724118 in Object.wait() [0x00007fb818b19000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait0(java.base/Native Method)
- waiting on <0x000000063680a560> (a java.lang.ref.NativeReferenceQueue$Lock)
at java.lang.Object.wait(java.base/Object.java:378)
at java.lang.Object.wait(java.base/Object.java:352)
at java.lang.ref.NativeReferenceQueue.await(java.base/NativeReferenceQueue.java:48)
at java.lang.ref.ReferenceQueue.remove0(java.base/ReferenceQueue.java:166)
at java.lang.ref.NativeReferenceQueue.remove(java.base/NativeReferenceQueue.java:89)
- locked <0x000000063680a560> (a java.lang.ref.NativeReferenceQueue$Lock)
at java.lang.ref.Finalizer$FinalizerThread.run(java.base/Finalizer.java:173)
"Signal Dispatcher" #11 [1724119] daemon prio=9 os_prio=0 cpu=0.19ms elapsed=98.20s tid=0x00007fb8341df860 nid=1724119 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Service Thread" #12 [1724120] daemon prio=9 os_prio=0 cpu=0.90ms elapsed=98.20s tid=0x00007fb8341e0dd0 nid=1724120 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Monitor Deflation Thread" #13 [1724121] daemon prio=9 os_prio=0 cpu=2.62ms elapsed=98.20s tid=0x00007fb8341e2370 nid=1724121 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #14 [1724122] daemon prio=9 os_prio=0 cpu=28.35ms elapsed=98.20s tid=0x00007fb8341e3e90 nid=1724122 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
No compile task
"C1 CompilerThread0" #17 [1724123] daemon prio=9 os_prio=0 cpu=30.76ms elapsed=98.20s tid=0x00007fb8341e55a0 nid=1724123 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
No compile task
"Common-Cleaner" #18 [1724124] daemon prio=8 os_prio=0 cpu=0.42ms elapsed=98.19s tid=0x00007fb834221360 nid=1724124 waiting on condition [0x00007fb818448000]
java.lang.Thread.State: TIMED_WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base/Native Method)
- parking to wait for <0x00000006368290d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(java.base/LockSupport.java:269)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base/AbstractQueuedSynchronizer.java:1852)
at java.lang.ref.ReferenceQueue.await(java.base/ReferenceQueue.java:79)
at java.lang.ref.ReferenceQueue.remove0(java.base/ReferenceQueue.java:151)
at java.lang.ref.ReferenceQueue.remove(java.base/ReferenceQueue.java:229)
at jdk.internal.ref.CleanerImpl.run(java.base/CleanerImpl.java:140)
at java.lang.Thread.runWith(java.base/Thread.java:1588)
at java.lang.Thread.run(java.base/Thread.java:1575)
at jdk.internal.misc.InnocuousThread.run(java.base/InnocuousThread.java:186)
"JDWP Event Helper Thread" #20 [1724128] daemon prio=10 os_prio=0 cpu=0.15ms elapsed=98.15s tid=0x00007fb83437cd30 nid=1724128 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Notification Thread" #21 [1724129] daemon prio=9 os_prio=0 cpu=0.05ms elapsed=98.15s tid=0x00007fb83437e150 nid=1724129 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"JDWP Transport Listener: dt_socket" #24 [1724686] daemon prio=10 os_prio=0 cpu=0.04ms elapsed=70.50s tid=0x00007fb7c0001200 nid=1724686 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Attach Listener" #25 [1726080] daemon prio=9 os_prio=0 cpu=0.25ms elapsed=0.10s tid=0x00007fb7e0000f50 nid=1726080 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"VM Thread" os_prio=0 cpu=0.76ms elapsed=98.21s tid=0x00007fb83416ff80 nid=1724116 runnable
"VM Periodic Task Thread" os_prio=0 cpu=12.35ms elapsed=98.21s tid=0x00007fb834115790 nid=1724115 waiting on condition
"G1 Service" os_prio=0 cpu=1.49ms elapsed=98.21s tid=0x00007fb8340fd320 nid=1724114 runnable
"G1 Refine#0" os_prio=0 cpu=0.04ms elapsed=98.21s tid=0x00007fb8340fc360 nid=1724113 runnable
"G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=98.22s tid=0x00007fb834062b90 nid=1724112 runnable
"G1 Main Marker" os_prio=0 cpu=0.04ms elapsed=98.22s tid=0x00007fb834061bf0 nid=1724111 runnable
"GC Thread#0" os_prio=0 cpu=0.04ms elapsed=98.22s tid=0x00007fb83404d4a0 nid=1724110 runnable
JNI global refs: 29, weak refs: 0
```
---------- BEGIN SOURCE ----------
public static void main(String[] args) throws InterruptedException {
test();
}
public static void test() {
ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
new Thread(() -> {
executor.shutdown();
}).start();
try {
ScheduledFuture<?> future = executor.schedule(() -> {
Thread.yield();
}, 10, TimeUnit.MILLISECONDS);
try {
future.get();
Thread.yield();
} catch (Throwable e) {
}
} catch (RejectedExecutionException e) {}
}
---------- END SOURCE ----------
openjdk 23.0.2 2025-01-21
OpenJDK Runtime Environment (nix) (build 23.0.2+7)
OpenJDK 64-Bit Server VM (nix) (build 23.0.2+7, mixed mode, sharing)
A DESCRIPTION OF THE PROBLEM :
I was adding tests to our deterministic concurrency testing framework [fray](https://github.com/cmu-pasta/fray) and noticed a deadlock exception raised from the following code
```
public static void main(String[] args) throws InterruptedException {
test();
}
public static void test() {
ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
new Thread(() -> {
executor.shutdown();
}).start();
try {
ScheduledFuture<?> future = executor.schedule(() -> {
Thread.yield();
}, 10, TimeUnit.MILLISECONDS);
try {
future.get();
Thread.yield();
} catch (Throwable e) {
}
} catch (RejectedExecutionException e) {}
}
```
And this seems a race condition inside ScheduledThreadPoolExecutor when thread A calls shutdown and thread B schedules a new task:
Here is an interleaving that triggers the bug:
Thread A Thread B
enters `delayedExecute` but does not call `
super.getQueue().add(task);
enters shutdown/onShutdown/tryTerminate but does not call
`mainLock.lock();`
calls `super.getQueue().add(task);`
but not call `ensurePrestart();`
set `ctl` to TIDYING
evaluate the following statement and then
return:
if (runStateAtLeast(c, SHUTDOWN)
&& (runStateAtLeast(c, STOP)
|| firstTask != null
|| workQueue.isEmpty()))
return false;
set `ctl` to TERMINATED
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
You may also use Fray to run the test code directly and it will report deadlock exception. https://github.com/cmu-pasta/fray
- you need to download and build Fray (the test case code is already included in Fray)
- run fray with fray bin `./bin/fray -cp ./integration-test/build/libs/fray-integration-test-0.4.5-SNAPSHOT.jar org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal`
- To reproduce the failure deterministically you may run `./bin/fray --replay /tmp/report/recording -cp ./integration-test/build/libs/fray-integration-test-0.4.5-SNAPSHOT.jar org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal`
- Note that `./bin/fray` is just a bash wrapper of a java command, so you may add additional arguments such as `-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005` to enable debugging.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
FutureTask should be canceled
ACTUAL -
Main thread hangs:
```
2025-06-01 12:40:27
Full thread dump OpenJDK 64-Bit Server VM (23-internal-adhoc.aoli.jdk mixed mode):
Threads class SMR info:
_java_thread_list=0x00007fb7e0001f20, length=13, elements={
0x00007fb83402ae40, 0x00007fb8341dccb0, 0x00007fb8341de150, 0x00007fb8341df860,
0x00007fb8341e0dd0, 0x00007fb8341e2370, 0x00007fb8341e3e90, 0x00007fb8341e55a0,
0x00007fb834221360, 0x00007fb83437cd30, 0x00007fb83437e150, 0x00007fb7c0001200,
0x00007fb7e0000f50
}
"main" #1 [1724109] prio=5 os_prio=0 cpu=75.54ms elapsed=98.22s tid=0x00007fb83402ae40 nid=1724109 waiting on condition [0x00007fb8393fe000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base/Native Method)
- parking to wait for <0x00000006364f0440> (a java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
at java.util.concurrent.locks.LockSupport.park(java.base/LockSupport.java:221)
at java.util.concurrent.FutureTask.awaitDone(java.base/FutureTask.java:500)
at java.util.concurrent.FutureTask.get(java.base/FutureTask.java:190)
at org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal.test(ScheduledThreadPoolWorkSteal.java:21)
at org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal.main(ScheduledThreadPoolWorkSteal.java:8)
"Reference Handler" #9 [1724117] daemon prio=10 os_prio=0 cpu=0.26ms elapsed=98.20s tid=0x00007fb8341dccb0 nid=1724117 waiting on condition [0x00007fb818c19000]
java.lang.Thread.State: RUNNABLE
at java.lang.ref.Reference.waitForReferencePendingList(java.base/Native Method)
at java.lang.ref.Reference.processPendingReferences(java.base/Reference.java:246)
at java.lang.ref.Reference$ReferenceHandler.run(java.base/Reference.java:208)
"Finalizer" #10 [1724118] daemon prio=8 os_prio=0 cpu=0.10ms elapsed=98.20s tid=0x00007fb8341de150 nid=1724118 in Object.wait() [0x00007fb818b19000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait0(java.base/Native Method)
- waiting on <0x000000063680a560> (a java.lang.ref.NativeReferenceQueue$Lock)
at java.lang.Object.wait(java.base/Object.java:378)
at java.lang.Object.wait(java.base/Object.java:352)
at java.lang.ref.NativeReferenceQueue.await(java.base/NativeReferenceQueue.java:48)
at java.lang.ref.ReferenceQueue.remove0(java.base/ReferenceQueue.java:166)
at java.lang.ref.NativeReferenceQueue.remove(java.base/NativeReferenceQueue.java:89)
- locked <0x000000063680a560> (a java.lang.ref.NativeReferenceQueue$Lock)
at java.lang.ref.Finalizer$FinalizerThread.run(java.base/Finalizer.java:173)
"Signal Dispatcher" #11 [1724119] daemon prio=9 os_prio=0 cpu=0.19ms elapsed=98.20s tid=0x00007fb8341df860 nid=1724119 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Service Thread" #12 [1724120] daemon prio=9 os_prio=0 cpu=0.90ms elapsed=98.20s tid=0x00007fb8341e0dd0 nid=1724120 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Monitor Deflation Thread" #13 [1724121] daemon prio=9 os_prio=0 cpu=2.62ms elapsed=98.20s tid=0x00007fb8341e2370 nid=1724121 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #14 [1724122] daemon prio=9 os_prio=0 cpu=28.35ms elapsed=98.20s tid=0x00007fb8341e3e90 nid=1724122 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
No compile task
"C1 CompilerThread0" #17 [1724123] daemon prio=9 os_prio=0 cpu=30.76ms elapsed=98.20s tid=0x00007fb8341e55a0 nid=1724123 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
No compile task
"Common-Cleaner" #18 [1724124] daemon prio=8 os_prio=0 cpu=0.42ms elapsed=98.19s tid=0x00007fb834221360 nid=1724124 waiting on condition [0x00007fb818448000]
java.lang.Thread.State: TIMED_WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base/Native Method)
- parking to wait for <0x00000006368290d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(java.base/LockSupport.java:269)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base/AbstractQueuedSynchronizer.java:1852)
at java.lang.ref.ReferenceQueue.await(java.base/ReferenceQueue.java:79)
at java.lang.ref.ReferenceQueue.remove0(java.base/ReferenceQueue.java:151)
at java.lang.ref.ReferenceQueue.remove(java.base/ReferenceQueue.java:229)
at jdk.internal.ref.CleanerImpl.run(java.base/CleanerImpl.java:140)
at java.lang.Thread.runWith(java.base/Thread.java:1588)
at java.lang.Thread.run(java.base/Thread.java:1575)
at jdk.internal.misc.InnocuousThread.run(java.base/InnocuousThread.java:186)
"JDWP Event Helper Thread" #20 [1724128] daemon prio=10 os_prio=0 cpu=0.15ms elapsed=98.15s tid=0x00007fb83437cd30 nid=1724128 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Notification Thread" #21 [1724129] daemon prio=9 os_prio=0 cpu=0.05ms elapsed=98.15s tid=0x00007fb83437e150 nid=1724129 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"JDWP Transport Listener: dt_socket" #24 [1724686] daemon prio=10 os_prio=0 cpu=0.04ms elapsed=70.50s tid=0x00007fb7c0001200 nid=1724686 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Attach Listener" #25 [1726080] daemon prio=9 os_prio=0 cpu=0.25ms elapsed=0.10s tid=0x00007fb7e0000f50 nid=1726080 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"VM Thread" os_prio=0 cpu=0.76ms elapsed=98.21s tid=0x00007fb83416ff80 nid=1724116 runnable
"VM Periodic Task Thread" os_prio=0 cpu=12.35ms elapsed=98.21s tid=0x00007fb834115790 nid=1724115 waiting on condition
"G1 Service" os_prio=0 cpu=1.49ms elapsed=98.21s tid=0x00007fb8340fd320 nid=1724114 runnable
"G1 Refine#0" os_prio=0 cpu=0.04ms elapsed=98.21s tid=0x00007fb8340fc360 nid=1724113 runnable
"G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=98.22s tid=0x00007fb834062b90 nid=1724112 runnable
"G1 Main Marker" os_prio=0 cpu=0.04ms elapsed=98.22s tid=0x00007fb834061bf0 nid=1724111 runnable
"GC Thread#0" os_prio=0 cpu=0.04ms elapsed=98.22s tid=0x00007fb83404d4a0 nid=1724110 runnable
JNI global refs: 29, weak refs: 0
```
---------- BEGIN SOURCE ----------
public static void main(String[] args) throws InterruptedException {
test();
}
public static void test() {
ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
new Thread(() -> {
executor.shutdown();
}).start();
try {
ScheduledFuture<?> future = executor.schedule(() -> {
Thread.yield();
}, 10, TimeUnit.MILLISECONDS);
try {
future.get();
Thread.yield();
} catch (Throwable e) {
}
} catch (RejectedExecutionException e) {}
}
---------- END SOURCE ----------