Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-6470570

Modify suspend/resume code to work better with the safepoint protocol

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • P4
    • hs10
    • 7
    • hotspot
    • None
    • b03
    • generic
    • generic

    Backports

      Description

        There are some rough edges on the suspend/resume code with regard to the safepoint protocol.

        Issue 1: Looking at check_safepoint_and_suspend_for_native_trans (comments and asserts elided):

        1999 void JavaThread::check_safepoint_and_suspend_for_native_trans(JavaThread *thread) {
        2002 JavaThread *curJT = JavaThread::current();
        2003 bool do_self_suspend = thread->is_external_suspend();
        2011 if (do_self_suspend && (!AllowJNIEnvProxy || curJT == thread)) {
        2012 JavaThreadState state = thread->thread_state();
        2017 thread->set_suspend_equivalent();
        2030 thread->set_thread_state(_thread_blocked);
        2031 thread->java_suspend_self();
        2032 thread->set_thread_state(state);
        2033 }
        2034
        2035 if (SafepointSynchronize::do_call_back()) {
        2038 SafepointSynchronize::block(curJT);
        2039 }
        2040 }

        If the write to the thread state in line 2032 doesn't become visible to the VMThread, the current thread could be seen in state _thread_blocked by the VMThread when going to a safepoint, even if the actual thread checked do_call_back() prior to the safepoint being initiated. That allows this thread to escape the safepoint. After line 2032 we need to insert a suitable "membar" as is done elsewhere thread state is updated

        Issue 2: From the wait_for/is_ext_suspend_completed code:

               // If the VMThread is trying to synchronize all the threads for a
               // safepoint, then we block ourself on the Threads_lock.
               if (SafepointSynchronize::is_synchronizing()) {
                 Threads_lock->lock();
                 Threads_lock->unlock();
               }

        When we attempt the lock(), if there is a safepoint in progress then we will initiate the safepoint call-back and actually block trying to acquire the Threads_lock inside the lock() method. When the safepoint is over we will eventually acquire the Threads_lock and unlock it; the original lock() request will then proceed and we will lock the Threads_lock again and then proceed with the unlock(). So if there is a safepoint this code causes us to acquire and release the Threads_lock twice.

        In wait_for_ext_suspend_completion a recent change (CR6440070) replaced a yield_all with a monitor wait. That monitor wait can be used to perform the safepoint check directly and avoid the explicit code involving the Threads_lock. Similarly in is_ext_suspend_completed the remaining yield_all can be replaced with the same monitor wait as used for 6440070.

        Attachments

          Issue Links

            Activity

              People

                dholmes David Holmes
                dholmes David Holmes
                Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                  Created:
                  Updated:
                  Resolved:
                  Imported:
                  Indexed: