JDK-8305670

Performance regression in LockSupport.unpark with lots of idle threads

Details

    • b23

      Description

        We noticed latency degradation when migrating an app from JDK 8 to JDK 17 or 20. The app has more than 8K threads, most of which are idle. Analysis showed a large amount of CPU time spent in ThreadsListHandle::cv_internal_thread_to_JavaThread, called from Unsafe_Unpark.

        I could reproduce the issue locally with a simple test case; see the attached UnparkRegression.java.

        It creates 10K idle threads that sleep indefinitely and two active threads that communicate with each other via a CyclicBarrier. The particular synchronization primitive does not matter: any java.util.concurrent class that eventually calls LockSupport.unpark is affected by the issue.
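
For readers without access to the attachment, the setup described above can be sketched roughly as follows. This is not the attached UnparkRegression.java: the class name UnparkSketch, the method names, and the 5-second measurement window are assumptions made for illustration only.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical sketch of the scenario; not the attached test case.
public class UnparkSketch {

    // Starts `count` daemon threads that do nothing but sleep.
    static void startIdleThreads(int count) {
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(Long.MAX_VALUE); // effectively "sleep indefinitely"
                } catch (InterruptedException e) {
                    // interrupted: let the idle thread exit
                }
            }, "idle-" + i);
            t.setDaemon(true); // do not keep the JVM alive
            t.start();
        }
    }

    // Two threads meet at a CyclicBarrier in a loop for about `millis` ms.
    // Each meeting ends with the last arriver waking the waiting thread,
    // which is where LockSupport.unpark gets called.
    // Returns the number of barrier roundtrips the calling thread completed.
    static long measureRoundtrips(long millis) {
        CyclicBarrier barrier = new CyclicBarrier(2);
        Thread partner = new Thread(() -> {
            try {
                while (true) {
                    barrier.await();
                }
            } catch (InterruptedException | BrokenBarrierException e) {
                // measurement is over: stop
            }
        }, "partner");
        partner.setDaemon(true);
        partner.start();

        long roundtrips = 0;
        long deadline = System.nanoTime() + millis * 1_000_000L;
        try {
            while (System.nanoTime() < deadline) {
                barrier.await();
                roundtrips++;
            }
            partner.interrupt(); // partner is blocked in (or about to enter) await()
            partner.join();
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
        return roundtrips;
    }

    public static void main(String[] args) {
        // Idle thread count defaults to the 10K mentioned in the description.
        int idle = args.length > 0 ? Integer.parseInt(args[0]) : 10_000;
        long millis = 5_000; // assumed measurement window
        startIdleThreads(idle);
        long n = measureRoundtrips(millis);
        System.out.printf("%d idle threads: %d roundtrips/s%n", idle, n * 1000 / millis);
    }
}
```

Running it with different idle thread counts (for example 0 and 10000 as the first argument) on the JDK versions being compared should show whether throughput depends on the number of idle threads. Creating 10K threads may require raising OS thread or memory limits on some systems.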

        Running the test with JDK 17 or JDK 20 on a 4-core ARM64 machine, it does 70K roundtrips per second, while with JDK 8 it does 162K (2.3x more).

        The issue appeared in JDK 10 with the introduction of Thread-SMR. The problem is the linear search in ThreadsList::includes called by cv_internal_thread_to_JavaThread:
        https://github.com/openjdk/jdk/blob/44f33ad1a9617fc23864c9ba5f063b3fc2f1e18c/src/hotspot/share/runtime/threadSMR.cpp#L829

        The call to ThreadsList::includes is guarded by a diagnostic flag `EnableThreadSMRExtraValidityChecks`, which is enabled by default.
        After adding -XX:-EnableThreadSMRExtraValidityChecks, the performance returns to JDK 8 levels.

              People

                dcubed Daniel Daugherty
                apangin Andrei Pangin