JDK-4423824

SEGV on vm exit - intermittent/timing related

    • Type: Bug
    • Resolution: Fixed
    • Priority: P3
    • 1.4.0
    • 1.4.0
    • hotspot
    • None
    • beta
    • sparc
    • solaris_2.6, solaris_7

      On Solaris, on both SPARC and Intel, I have run into the following problem
      on VM shutdown. It happens more frequently with the suspend/resume
      timing changes to taking down daemons:



      At this point the VMThread has notified the other threads that it is
      gone and is executing "delete this" at the end of VMThread::run().
      The VMThread destructor calls the HandleMark destructor, which
      decrements nof_handlemarks (in the debug version).

      The decrement calls the generated code behind atomic::decrement,
      which is actually os::solaris::atomic_increment_func and was
      set up via generate_atomic_increment. (This stack trace is
      for Solaris Intel; the same problem occurs on Solaris SPARC.)
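The indirection described above can be illustrated with a small sketch. It is not HotSpot code: the namespace `dispatch_sketch` and every name in it are hypothetical, and the "generated" routine is an ordinary C++ function standing in for code emitted into the CodeHeap. It only shows why increment, decrement, and add all call through one shared function pointer, which is consistent with the decrement in the trace landing at the same address as the add.

```cpp
#include <atomic>
#include <cassert>

namespace dispatch_sketch {

// Hypothetical signature, loosely mirroring the (inc, loc) order in the trace.
using AddFn = int (*)(int inc, std::atomic<int>* loc);

// Fallback used until the "generated" routine has been installed.
int add_bootstrap(int inc, std::atomic<int>* loc) {
    return loc->fetch_add(inc) + inc;
}

// Stand-in for the routine that would live in dynamically allocated code memory.
int add_generated(int inc, std::atomic<int>* loc) {
    return loc->fetch_add(inc) + inc;
}

// Single dispatch point shared by every atomic helper below.
std::atomic<AddFn> add_func{add_bootstrap};

void install_generated_stub() { add_func.store(add_generated); }

bool using_generated_stub() { return add_func.load() == add_generated; }

// All three helpers call through the same pointer, so all of them
// break together if the memory behind that pointer goes away.
int atomic_add(std::atomic<int>* dest, int value) {
    return add_func.load()(value, dest);
}
int atomic_increment(std::atomic<int>* dest) { return atomic_add(dest, 1); }
int atomic_decrement(std::atomic<int>* dest) { return atomic_add(dest, -1); }

}  // namespace dispatch_sketch
```

In this sketch the pointer target is always valid. In the report, the target sits in memory that another thread later unmaps, so a late caller faults.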

      The code for atomic_increment_function was at 0xdb4002f1 in
      thread 1's memory. I've appended a stack trace from a restart
      to show how that got allocated.

      Meanwhile thread 1 (main) has completed its request to take the VM
      down (jni_DestroyJavaVM) and is cleaning up via exit(). Part
      of this cleanup deletes the CodeHeap, freeing the memory
      where atomic_increment_function resides.

      This gives the VMThread a SEGV.
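The ordering problem can be sketched in portable C++. This is an illustrative model only, not the fix that was applied to HotSpot: `FakeCodeHeap`, `TerminationLatch`, and `shutdown_in_safe_order` are hypothetical names, and setting a pointer to null stands in for unmapping the code memory. The sketch shows one safe ordering, in which the worker signals termination only after its last call through the stub, so the main thread cannot release the heap too early.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

using StubFn = int (*)(std::atomic<int>*, int);

static int add_stub(std::atomic<int>* dest, int value) {
    return dest->fetch_add(value) + value;
}

// Hypothetical stand-in for the CodeHeap: the stub is reachable only via a pointer.
struct FakeCodeHeap {
    std::atomic<StubFn> stub{add_stub};
    void release() { stub.store(nullptr); }  // models the munmap at exit
};

// One-shot "I have terminated" notification.
struct TerminationLatch {
    std::mutex mu;
    std::condition_variable cv;
    bool done = false;
    void signal() {
        { std::lock_guard<std::mutex> guard(mu); done = true; }
        cv.notify_all();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(mu);
        cv.wait(lock, [this] { return done; });
    }
};

// Returns true if the worker's final stub call found the heap still mapped
// and the counter was decremented to zero.
bool shutdown_in_safe_order() {
    FakeCodeHeap heap;
    TerminationLatch terminated;
    std::atomic<int> nof_handlemarks{1};
    bool stub_was_mapped = false;

    std::thread worker([&] {
        // Teardown work that still depends on the stub (the destructor path).
        StubFn fn = heap.stub.load();
        stub_was_mapped = (fn != nullptr);
        if (fn) fn(&nof_handlemarks, -1);
        // Only now announce termination; in the report this happened earlier.
        terminated.signal();
    });

    terminated.wait();  // main thread: wait before freeing anything
    heap.release();     // safe: the worker no longer touches the stub
    worker.join();
    return stub_was_mapped && nof_handlemarks.load() == 0;
}
```

Calling `shutdown_in_safe_order()` repeatedly should always return true. Moving `terminated.signal()` above the stub call reproduces the window described in the report, where the release can win the race.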


      I have seen this while running volano with SafepointALot (took ~5 hours)
      and peptest with profiling (took 24 hours, > 3000 times).

      I have not (yet?) seen this prior to my suspend/resume changes being
      rolled into Merlin, but as I understand the problem, it might be
      independent of my changes.

      Debugger info:

      VMThread:

      (dbx) where t@4
      current thread: t@4
        [1] 0xdb4002f1(0xdff5cb9c, 0x1), at 0xdb4002f0
        [2] atomic::add(dest = 0xdff5cb9c, add_value = 1), line 17 in "atomic_solaris.inline.hpp"
        [3] os::handle_unexpected_exception(thread = (nil), sig = 11, pc = 0xdb4002f1 "", extra_info = 0xdf513c4c), line 784 in "os.cpp"
        [4] JVM_handle_solaris_signal(sig = 11, info = 0xdf513c4c, ucVoid = 0xdf513a4c, abort_if_unrecognized = 1), line 916 in "os_solaris_i486.cpp"
        [5] signalHandler(sig = 11, info = 0xdf513c4c, ucVoid = 0xdf513a4c), line 1935 in "os_solaris.cpp"
        [6] __sighndlr(0xb, 0xdf513c4c, 0xdf513a4c, 0xdfae5414), at 0xdf7b92b3
        [7] sigacthandler(), at 0xdf7c60f7
        ---- called from signal handler with signal 11 (SIGSEGV) ------
        [8] 0xdb4002f1(0xdff51f2c, 0x0), at 0xdb4002f0
        [9] atomic::decrement(dest = 0xdff51f2c), line 25 in "atomic_solaris.inline.hpp"
        [10] HandleMark::~HandleMark(this = 0x804dd10), line 129 in "handles.cpp"
        [11] Thread::~Thread(this = 0x80e4ad0), line 104 in "thread.cpp"
        [12] VMThread::~VMThread(0x80e4ad0), at 0xdfbda8e6
        [13] __SLIP.DELETER__A(0x80e4ad0, 0x1), at 0xdfbda796
        [14] VMThread::run(this = 0x80e4ad0), line 192 in "vmThread.cpp"
        [15] _start(data = 0x80e4ad0), line 494 in "os_solaris.cpp"

      current atomic_increment_function:

      (dbx) x 0xdb4002f1/4 i
      0xdb4002f1: <bad address 0xdb4002f1>
      0x00000000: <bad address 0x0>
      0x00000000: <bad address 0x0>
      0x00000000: <bad address 0x0>

      (dbx) where t@1
      current thread: t@1
        [1] _munmap(0xdb400000, 0x2000000), at 0xdf6c45a8
        [2] os::release_memory(addr = 0xdb400000 "", bytes = 33554432U), line 1388 in "os_solaris.cpp"
        [3] VirtualSpace::release(this = 0xdff4a09c), line 154 in "virtualspace.cpp"
        [4] VirtualSpace::~VirtualSpace(this = 0xdff4a09c), line 149 in "virtualspace.cpp"
        [5] CodeHeap::~CodeHeap(0xdff4a09c), at 0xdf8dce87
        [6] __STATIC_DESTRUCTOR(), line 0 in "codeCache.cpp"
        [7] _fini(), at 0xdfdd4e2a
        [8] _exithandle(), at 0xdf6b283c
        [9] exit(), at 0xdf711882



      Appendix 1: Another run with Thread 1 setting up the
      atomic_increment_function code:
      current thread: t@1
      =>[1] os::Solaris::atomic_increment_bootstrap(inc = 1, loc = 0xdff4da9c), line 2627 in "os_solaris.cpp"
        [2] atomic::increment(dest = 0xdff4da9c), line 21 in "atomic_solaris.inline.hpp"
        [3] GC_locker::lock(), line 18 in "gcLocker.inline.hpp"
        [4] universe_init(), line 400 in "universe.cpp"
        [5] init_globals(), line 131 in "init.cpp"
        [6] Threads::create_vm(args = 0x8046c94), line 2079 in "thread.cpp"
        [7] JNI_CreateJavaVM(vm = 0x8046d04, penv = 0x8046d00, args = 0x8046c94), line 2108 in "jni.cpp"
        [8] InitializeJVM(pvm = 0x8046d04, penv = 0x8046d00, ifn = 0x8046cdc), line 465 in "java.c"
        [9] main(argc = 2, argv = 0x8046d44), line 173 in "java.c"


      original atomic_increment_entry code:

      (dbx) x 0xdb4002f1/20 i
      0xdb4002f1: pushl %edx
      0xdb4002f2: pushl %ecx
      0xdb4002f3: movl 12(%esp,1),%eax
      0xdb4002f7: movl 16(%esp,1),%edx
      0xdb4002fb: movl %eax,%ecx
      0xdb4002fd: bad opcode
      0xdb400300: addb (%ebx),%al
      0xdb400302: rcrl $0xc3,90(%ecx)
      0xdb400306: pushl %ebp
      0xdb400307: movl %esp,%ebp
      0xdb400309: movl 0(%ebp),%eax
      0xdb40030c: movl (%eax),%eax
      0xdb40030e: popl %ebp
      0xdb40030f: ret


      Appendix 2: Another run with Thread 1 caught earlier, in the
      Threads::destroy_vm() request that I presume triggered the exit.

      (dbx) where t@1
      current thread: t@1
        [1] __lwp_sema_wait(0x804cdd0), at 0xdf6c6479
        [2] _park(), at 0xdf7bac55
        [3] _swtch(), at 0xdf7baa82
        [4] _cond_wait_cancel(0x8060730, 0x8060718), at 0xdf7b9aa5
        [5] os::Solaris::cond_wait(cv = 0x8060730, mx = 0x8060718), line 178 in "os_solaris.hpp"
        [6] os::Solaris::Event::wait(this = 0x8060710), line 303 in "os_solaris.hpp"
        [7] Monitor::wait(this = 0x80606a8, no_safepoint_check = 0, timeout = 0), line 173 in "mutex_solaris.cpp"
        [8] VMThread::execute(op = 0x8046c30), line 396 in "vmThread.cpp"
        [9] Threads::destroy_vm(), line 2334 in "thread.cpp"
        [10] jni_DestroyJavaVM(vm = 0xdff55c98), line 2154 in "jni.cpp"
        [11] main(argc = 2, argv = 0x8046d44), line 282 in "java.c"


            acorn Karen Kinnear (Inactive)