JDK-4901065

[1.3.1_03] Hotspot terminates during safepoint process

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: P2
    • None
    • 1.3.1_03
    • hotspot
    • sparc
    • solaris_8

      A HotSpot crash occurs at an end user's site.
      The crash seems to occur when HotSpot detects an inconsistent internal
      state during safepoint processing.

      Unfortunately, the user cannot create a test case because the crash
      occurs only while their large system is running.
      They sent their analysis of the core file (see "INVESTIGATION" below).


      CONFIGURATION:

       - OS : Solaris 8
       - MPU : UltraSPARC-II 450 [MHz] * 4
       - Memory : 4096 [MB]
       - JDK : 1.3.1_03
       
      LOG file:

      #
      # Fatal: Deadlock in safepoint code. stopped at 0x0
      #
      # Error ID: 53414645504F494E540E435050010A [ Patched ]

      This Error ID corresponds to src/share/vm/runtime/safepoint.cpp, line 266.


      STACK TRACE:
      .....
       [ 1] libthread.so.1:__sigprocmask+0x8 jmp %o7 + 0x8
       [ 2] libthread.so.1:_sigon+0xd0 call libthread.so.1:_resetsig (ff35e638)
       [ 3] libthread.so.1:_thrp_kill+0xf8 call libthread.so.1:_lmutex_unlock (ff35b2b8)
       [ 4] libc.so.1:raise+0x40 call _thr_kill (ff2bdc04)
       [ 5] libc.so.1:abort+0x100 call raise (ff2bd3dc)
       [ 6] libjvm.so:void os::abort(long)+0xc8 call abort (fe4e5fc8)
       [ 7] libjvm.so:void report_error(long,const char*,int,const char*,const char*,...)+0x50c call libjvm.so:void os::abort(long) (fe3c9d64)
       [ 8] libjvm.so:void report_fatal(const char*,int,const char*,...)+0x60 call libjvm.so:void report_error(long,const char*,int,const char*,const char*,...) (fe3322a0)
       [ 9] libjvm.so:void SafepointSynchronize::block(JavaThread*)+0x1c0 call libjvm.so:void report_fatal(const char*,int,const char*,...) (fe331ef8)
       [10] libjvm.so:jni_FindClass+0x50 call libjvm.so:void SafepointSynchronize::block(JavaThread*) (fe182864)
       [11] libODjavasv2.so:Call_Java+0x76c jmpl %l2, %o7
      .....



      INVESTIGATION:

      Please see the following two source files (interfaceSupport.hpp and safepoint.cpp).

      ==== Source code ====

      - ./src/share/vm/runtime/interfaceSupport.hpp

      ...
      Line#81:
        static inline void transition(JavaThread *thread, JavaThreadState from, JavaThreadState to) {
          assert(from != _thread_in_Java, "use transition_from_java");
          assert((from & 1) == 0 && (to & 1) == 0, "odd numbers are transitions states");
          assert(thread->thread_state() == from, "coming from wrong thread state");
          // Change to transition state (assumes total store ordering! -Urs)
          thread->set_thread_state((JavaThreadState)(from + 1)); <------ (1)
          if (SafepointSynchronize::do_call_back()) SafepointSynchronize::block(thread);
                                                                      <------ (2)
          thread->set_thread_state(to);
        }

      ....
      line#93:
         void trans(JavaThreadState from, JavaThreadState to) { transition(_thread, from, to); }
         
       ... <------ (3)
      line#144:
        ThreadInVMfromNative(JavaThread* thread) : ThreadStateTransition(thread) {
          trans(_thread_in_native, _thread_in_vm); <------ (4)
        }

      ....
      line#367:
      #define JNI_ENTRY(result_type, header)                                 \    <------ (5)
      extern "C" {                                                           \
        static result_type JNICALL header {                                  \
          JavaThread* thread=JavaThread::thread_from_jni_environment(env);   \
          assert( !VerifyJNIEnvThread || (thread == Thread::current()), "JNIEnv is only valid in same thread"); \
          ThreadInVMfromNative __tiv(thread);                                \
          debug_only(VMNativeEntryWrapper __vew;)                            \
          __ENTRY(result_type, header, thread)
      ....


      -./src/share/vm/runtime/safepoint.cpp

      line#214:
      void SafepointSynchronize::block(JavaThread *thread) {

          case _thread_in_native_trans:
          case _thread_blocked_trans:
          case _thread_new_trans:
            if (thread->safepoint_state()->type() == ThreadSafepointState::_call_back) {
                                                                <------(6)
                                                                
              address stop_pc = thread->safepoint_state()->_stop_pc;
              InterpreterCodelet* icd = Interpreter::codelet_containing(stop_pc);
              if (icd != NULL) {
                icd->print();
                fatal("Wrong safepoint info in interpreter");
              } else {
                fatal1("Deadlock in safepoint code. stopped at 0x%x", stop_pc);
              }
            }

      ===== Source code end ====

      Because we could not find a reliable way to reproduce this issue,
      we examined the core file closely.
      We consider the following scenario likely.

      1) The program enters jni_FindClass (see frame [10] of the stack trace).
      2) While the thread is changing its state from _thread_in_native to
         _thread_in_vm (see (1) - (4) above), a request to stop at a safepoint
         arrives (apparently for a GC triggered by an OutOfMemory condition).
      3) Then, while ThreadSafepointState::examine_state_of_thread is running in
         SafepointSynchronize::begin(), thread->safepoint_state()->type() becomes
         ThreadSafepointState::_call_back. That is, the if-statement at (6) becomes
         true because some event that changes the thread state happens somewhere.
      4) As a result, HotSpot detects an inconsistent thread state and aborts.


      ===========================================================================

            minqi Yumin Qi
            tbaba Tadayuki Baba (Inactive)