Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8358601

Race condition causes FutureTask returned by ScheduledThreadPoolExecutor hangs

XMLWordPrintable

      ADDITIONAL SYSTEM INFORMATION :
      openjdk 23.0.2 2025-01-21
      OpenJDK Runtime Environment (nix) (build 23.0.2+7)
      OpenJDK 64-Bit Server VM (nix) (build 23.0.2+7, mixed mode, sharing)

      A DESCRIPTION OF THE PROBLEM :
      I was adding tests to our deterministic concurrency testing framework [fray](https://github.com/cmu-pasta/fray) and noticed a deadlock exception raised from the following code

      ```
          public static void main(String[] args) throws InterruptedException {
              test();
          }

          public static void test() {
              ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
              new Thread(() -> {
                  executor.shutdown();
              }).start();
              try {
                  ScheduledFuture<?> future = executor.schedule(() -> {
                      Thread.yield();
                  }, 10, TimeUnit.MILLISECONDS);
                  try {
                      future.get();
                      Thread.yield();
                  } catch (Throwable e) {
                  }
              } catch (RejectedExecutionException e) {}
          }
      ```

      And this seems a race condition inside ScheduledThreadPoolExecutor when thread A calls shutdown and thread B schedules a new task:

      Here is an interleaving that triggers the bug:

      Thread A Thread B
      enters `delayedExecute` but does not call `
      super.getQueue().add(task);

                                                                                    enters shutdown/onShutdown/tryTerminate but does not call
                                                                                    `mainLock.lock();`

      calls `super.getQueue().add(task);`
      but not call `ensurePrestart();`

                                                                                    set `ctl` to TIDYING

      evaluate the following statement and then
      return:
                  if (runStateAtLeast(c, SHUTDOWN)
                      && (runStateAtLeast(c, STOP)
                          || firstTask != null
                          || workQueue.isEmpty()))
                      return false;


                                                                                   set `ctl` to TERMINATED

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :

      You may also use Fray to run the test code directly and it will report deadlock exception. https://github.com/cmu-pasta/fray

      - you need to download and build Fray (the test case code is already included in Fray)
      - run fray with fray bin `./bin/fray -cp ./integration-test/build/libs/fray-integration-test-0.4.5-SNAPSHOT.jar org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal`
      - To reproduce the failure deterministically you may run `./bin/fray --replay /tmp/report/recording -cp ./integration-test/build/libs/fray-integration-test-0.4.5-SNAPSHOT.jar org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal`
      - Note that `./bin/fray` is just a bash wrapper of a java command, so you may add additional arguments such as `-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005` to enable debugging.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      FutureTask should be canceled
      ACTUAL -
      Main thread hangs:
      ```
      2025-06-01 12:40:27
      Full thread dump OpenJDK 64-Bit Server VM (23-internal-adhoc.aoli.jdk mixed mode):

      Threads class SMR info:
      _java_thread_list=0x00007fb7e0001f20, length=13, elements={
      0x00007fb83402ae40, 0x00007fb8341dccb0, 0x00007fb8341de150, 0x00007fb8341df860,
      0x00007fb8341e0dd0, 0x00007fb8341e2370, 0x00007fb8341e3e90, 0x00007fb8341e55a0,
      0x00007fb834221360, 0x00007fb83437cd30, 0x00007fb83437e150, 0x00007fb7c0001200,
      0x00007fb7e0000f50
      }

      "main" #1 [1724109] prio=5 os_prio=0 cpu=75.54ms elapsed=98.22s tid=0x00007fb83402ae40 nid=1724109 waiting on condition [0x00007fb8393fe000]
         java.lang.Thread.State: WAITING (parking)
              at jdk.internal.misc.Unsafe.park(java.base/Native Method)
              - parking to wait for <0x00000006364f0440> (a java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
              at java.util.concurrent.locks.LockSupport.park(java.base/LockSupport.java:221)
              at java.util.concurrent.FutureTask.awaitDone(java.base/FutureTask.java:500)
              at java.util.concurrent.FutureTask.get(java.base/FutureTask.java:190)
              at org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal.test(ScheduledThreadPoolWorkSteal.java:21)
              at org.pastalab.fray.test.core.success.threadpool.ScheduledThreadPoolWorkSteal.main(ScheduledThreadPoolWorkSteal.java:8)

      "Reference Handler" #9 [1724117] daemon prio=10 os_prio=0 cpu=0.26ms elapsed=98.20s tid=0x00007fb8341dccb0 nid=1724117 waiting on condition [0x00007fb818c19000]
         java.lang.Thread.State: RUNNABLE
              at java.lang.ref.Reference.waitForReferencePendingList(java.base/Native Method)
              at java.lang.ref.Reference.processPendingReferences(java.base/Reference.java:246)
              at java.lang.ref.Reference$ReferenceHandler.run(java.base/Reference.java:208)

      "Finalizer" #10 [1724118] daemon prio=8 os_prio=0 cpu=0.10ms elapsed=98.20s tid=0x00007fb8341de150 nid=1724118 in Object.wait() [0x00007fb818b19000]
         java.lang.Thread.State: WAITING (on object monitor)
              at java.lang.Object.wait0(java.base/Native Method)
              - waiting on <0x000000063680a560> (a java.lang.ref.NativeReferenceQueue$Lock)
              at java.lang.Object.wait(java.base/Object.java:378)
              at java.lang.Object.wait(java.base/Object.java:352)
              at java.lang.ref.NativeReferenceQueue.await(java.base/NativeReferenceQueue.java:48)
              at java.lang.ref.ReferenceQueue.remove0(java.base/ReferenceQueue.java:166)
              at java.lang.ref.NativeReferenceQueue.remove(java.base/NativeReferenceQueue.java:89)
              - locked <0x000000063680a560> (a java.lang.ref.NativeReferenceQueue$Lock)
              at java.lang.ref.Finalizer$FinalizerThread.run(java.base/Finalizer.java:173)

      "Signal Dispatcher" #11 [1724119] daemon prio=9 os_prio=0 cpu=0.19ms elapsed=98.20s tid=0x00007fb8341df860 nid=1724119 waiting on condition [0x0000000000000000]
         java.lang.Thread.State: RUNNABLE

      "Service Thread" #12 [1724120] daemon prio=9 os_prio=0 cpu=0.90ms elapsed=98.20s tid=0x00007fb8341e0dd0 nid=1724120 runnable [0x0000000000000000]
         java.lang.Thread.State: RUNNABLE

      "Monitor Deflation Thread" #13 [1724121] daemon prio=9 os_prio=0 cpu=2.62ms elapsed=98.20s tid=0x00007fb8341e2370 nid=1724121 runnable [0x0000000000000000]
         java.lang.Thread.State: RUNNABLE

      "C2 CompilerThread0" #14 [1724122] daemon prio=9 os_prio=0 cpu=28.35ms elapsed=98.20s tid=0x00007fb8341e3e90 nid=1724122 waiting on condition [0x0000000000000000]
         java.lang.Thread.State: RUNNABLE
         No compile task

      "C1 CompilerThread0" #17 [1724123] daemon prio=9 os_prio=0 cpu=30.76ms elapsed=98.20s tid=0x00007fb8341e55a0 nid=1724123 waiting on condition [0x0000000000000000]
         java.lang.Thread.State: RUNNABLE
         No compile task

      "Common-Cleaner" #18 [1724124] daemon prio=8 os_prio=0 cpu=0.42ms elapsed=98.19s tid=0x00007fb834221360 nid=1724124 waiting on condition [0x00007fb818448000]
         java.lang.Thread.State: TIMED_WAITING (parking)
              at jdk.internal.misc.Unsafe.park(java.base/Native Method)
              - parking to wait for <0x00000006368290d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
              at java.util.concurrent.locks.LockSupport.parkNanos(java.base/LockSupport.java:269)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base/AbstractQueuedSynchronizer.java:1852)
              at java.lang.ref.ReferenceQueue.await(java.base/ReferenceQueue.java:79)
              at java.lang.ref.ReferenceQueue.remove0(java.base/ReferenceQueue.java:151)
              at java.lang.ref.ReferenceQueue.remove(java.base/ReferenceQueue.java:229)
              at jdk.internal.ref.CleanerImpl.run(java.base/CleanerImpl.java:140)
              at java.lang.Thread.runWith(java.base/Thread.java:1588)
              at java.lang.Thread.run(java.base/Thread.java:1575)
              at jdk.internal.misc.InnocuousThread.run(java.base/InnocuousThread.java:186)

      "JDWP Event Helper Thread" #20 [1724128] daemon prio=10 os_prio=0 cpu=0.15ms elapsed=98.15s tid=0x00007fb83437cd30 nid=1724128 runnable [0x0000000000000000]
         java.lang.Thread.State: RUNNABLE

      "Notification Thread" #21 [1724129] daemon prio=9 os_prio=0 cpu=0.05ms elapsed=98.15s tid=0x00007fb83437e150 nid=1724129 runnable [0x0000000000000000]
         java.lang.Thread.State: RUNNABLE

      "JDWP Transport Listener: dt_socket" #24 [1724686] daemon prio=10 os_prio=0 cpu=0.04ms elapsed=70.50s tid=0x00007fb7c0001200 nid=1724686 runnable [0x0000000000000000]
         java.lang.Thread.State: RUNNABLE

      "Attach Listener" #25 [1726080] daemon prio=9 os_prio=0 cpu=0.25ms elapsed=0.10s tid=0x00007fb7e0000f50 nid=1726080 waiting on condition [0x0000000000000000]
         java.lang.Thread.State: RUNNABLE

      "VM Thread" os_prio=0 cpu=0.76ms elapsed=98.21s tid=0x00007fb83416ff80 nid=1724116 runnable

      "VM Periodic Task Thread" os_prio=0 cpu=12.35ms elapsed=98.21s tid=0x00007fb834115790 nid=1724115 waiting on condition

      "G1 Service" os_prio=0 cpu=1.49ms elapsed=98.21s tid=0x00007fb8340fd320 nid=1724114 runnable

      "G1 Refine#0" os_prio=0 cpu=0.04ms elapsed=98.21s tid=0x00007fb8340fc360 nid=1724113 runnable

      "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=98.22s tid=0x00007fb834062b90 nid=1724112 runnable

      "G1 Main Marker" os_prio=0 cpu=0.04ms elapsed=98.22s tid=0x00007fb834061bf0 nid=1724111 runnable

      "GC Thread#0" os_prio=0 cpu=0.04ms elapsed=98.22s tid=0x00007fb83404d4a0 nid=1724110 runnable

      JNI global refs: 29, weak refs: 0
      ```


      ---------- BEGIN SOURCE ----------
          public static void main(String[] args) throws InterruptedException {
              test();
          }

          public static void test() {
              ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
              new Thread(() -> {
                  executor.shutdown();
              }).start();
              try {
                  ScheduledFuture<?> future = executor.schedule(() -> {
                      Thread.yield();
                  }, 10, TimeUnit.MILLISECONDS);
                  try {
                      future.get();
                      Thread.yield();
                  } catch (Throwable e) {
                  }
              } catch (RejectedExecutionException e) {}
          }
      ---------- END SOURCE ----------

            vklang Viktor Klang
            eaymane EL Haimer Aymane
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated: