On Solaris, both Sparc and Intel, I have run into the following problem on
VM shutdown (it happens more frequently with the suspend/resume
timing changes for taking down daemons):
The current state is that the VMThread has notified other threads that
it is gone and is actually doing the "delete this" at the end of the
VMThread::run() code. The VMThread destructor calls the HandleMark
destructor, which updates nof_handlemarks (in the debug version).
This attempts to call the generated code for atomic::decrement,
which is actually os::solaris::atomic_increment_func, which was
set up via generate_atomic_increment. (This stack trace is
for Solaris Intel; the same problem occurs on Solaris Sparc.)
The code for atomic_increment_function was at 0xdb4002f1 in
thread 1's memory. I've appended a stack trace from a restart
to show how that got allocated.
Meanwhile thread 1 (main) has completed its request to take the VM
down (jni_DestroyJavaVM) and is cleaning up via exit(). Part
of this cleanup destroys the CodeHeap, unmapping the memory
where atomic_increment_function resides.
When the VMThread then calls into that address, it gets a SEGV.
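The ordering problem described above can be modelled in a few lines. This is only a sketch under stated assumptions, not HotSpot code: FakeCodeHeap, call_stub and shutdown_is_safe are hypothetical names, a boolean flag stands in for the mapped CodeHeap memory, and a false return stands in for the SEGV. It compares a shutdown that releases the heap while the VMThread stand-in still has destructor work pending with one that waits for that thread first.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the CodeHeap region holding the generated stub.
struct FakeCodeHeap {
    std::atomic<bool> mapped{true};
    void release() { mapped.store(false); }  // models the munmap() done by ~CodeHeap
};

// Models a call through the generated atomic stub. Returns false in the case
// where the real VM would fault because the stub's memory is already gone.
bool call_stub(const FakeCodeHeap& heap, std::atomic<int>& counter) {
    if (!heap.mapped.load()) return false;
    counter.fetch_sub(1);
    return true;
}

// Simulates one shutdown. With join_first == true the main thread waits for
// the worker (the VMThread stand-in) before releasing the heap. With
// join_first == false the worker is deliberately held back until the heap is
// released, which forces the bad interleaving from the report.
bool shutdown_is_safe(bool join_first) {
    FakeCodeHeap heap;
    std::atomic<int> nof_handlemarks{1};
    bool stub_ok = false;

    std::thread vm_thread([&] {
        if (!join_first) {
            while (heap.mapped.load()) std::this_thread::yield();
        }
        // Corresponds to ~HandleMark calling atomic::decrement.
        stub_ok = call_stub(heap, nof_handlemarks);
    });

    if (join_first) {
        vm_thread.join();   // worker finishes its destructor work first
        heap.release();
    } else {
        heap.release();     // exit() path runs static destructors early
        vm_thread.join();
    }
    return stub_ok;
}
```

In this model, shutdown_is_safe(true) reports a successful stub call and shutdown_is_safe(false) reports the failure. Whether waiting for the VMThread is the right fix in the real VM is a separate question that this sketch does not settle.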
I have seen this running volano with SafepointALot (took about 5 hours) and
peptest with profiling (took 24 hours, more than 3000 times).
I have not (yet?) seen this in builds prior to my suspend/resume changes
being rolled into Merlin, but as I understand the problem, it
may be independent of my changes.
Debugger info:
VMThread:
(dbx) where t@4
current thread: t@4
[1] 0xdb4002f1(0xdff5cb9c, 0x1), at 0xdb4002f0
[2] atomic::add(dest = 0xdff5cb9c, add_value = 1), line 17 in "atomic_solaris.inline.hpp"
[3] os::handle_unexpected_exception(thread = (nil), sig = 11, pc = 0xdb4002f1 "", extra_info = 0xdf513c4c), line 784 in "os.cpp"
[4] JVM_handle_solaris_signal(sig = 11, info = 0xdf513c4c, ucVoid = 0xdf513a4c, abort_if_unrecognized = 1), line 916 in "os_solaris_i486.cpp"
[5] signalHandler(sig = 11, info = 0xdf513c4c, ucVoid = 0xdf513a4c), line 1935 in "os_solaris.cpp"
[6] __sighndlr(0xb, 0xdf513c4c, 0xdf513a4c, 0xdfae5414), at 0xdf7b92b3
[7] sigacthandler(), at 0xdf7c60f7
---- called from signal handler with signal 11 (SIGSEGV) ------
[8] 0xdb4002f1(0xdff51f2c, 0x0), at 0xdb4002f0
[9] atomic::decrement(dest = 0xdff51f2c), line 25 in "atomic_solaris.inline.hpp"
[10] HandleMark::~HandleMark(this = 0x804dd10), line 129 in "handles.cpp"
[11] Thread::~Thread(this = 0x80e4ad0), line 104 in "thread.cpp"
[12] VMThread::~VMThread(0x80e4ad0), at 0xdfbda8e6
[13] __SLIP.DELETER__A(0x80e4ad0, 0x1), at 0xdfbda796
[14] VMThread::run(this = 0x80e4ad0), line 192 in "vmThread.cpp"
[15] _start(data = 0x80e4ad0), line 494 in "os_solaris.cpp"
current atomic_increment_function:
(dbx) x 0xdb4002f1/4 i
0xdb4002f1: <bad address 0xdb4002f1>
0x00000000: <bad address 0x0>
0x00000000: <bad address 0x0>
0x00000000: <bad address 0x0>
(dbx) where t@1
current thread: t@1
[1] _munmap(0xdb400000, 0x2000000), at 0xdf6c45a8
[2] os::release_memory(addr = 0xdb400000 "", bytes = 33554432U), line 1388 in "os_solaris.cpp"
[3] VirtualSpace::release(this = 0xdff4a09c), line 154 in "virtualspace.cpp"
[4] VirtualSpace::~VirtualSpace(this = 0xdff4a09c), line 149 in "virtualspace.cpp"
[5] CodeHeap::~CodeHeap(0xdff4a09c), at 0xdf8dce87
[6] __STATIC_DESTRUCTOR(), line 0 in "codeCache.cpp"
[7] _fini(), at 0xdfdd4e2a
[8] _exithandle(), at 0xdf6b283c
[9] exit(), at 0xdf711882
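As a quick cross-check of the two traces above, the faulting address can be compared against the range passed to _munmap. The sketch below uses only numbers copied from the dbx output; the helper in_region and the constant names are illustrative and not part of the VM.

```cpp
#include <cassert>
#include <cstdint>

// True when addr lies in the half-open range [base, base + size).
constexpr bool in_region(std::uint64_t addr, std::uint64_t base, std::uint64_t size) {
    return addr >= base && addr - base < size;
}

// Values copied from the dbx output.
constexpr std::uint64_t kFaultPc    = 0xdb4002f1;  // frame [8] of t@4
constexpr std::uint64_t kUnmapBase  = 0xdb400000;  // first argument of _munmap
constexpr std::uint64_t kUnmapBytes = 0x2000000;   // second argument of _munmap

// 0x2000000 is the same 33554432 bytes reported by os::release_memory.
static_assert(kUnmapBytes == 33554432, "size matches os::release_memory");

// The stub address sits 0x2f1 bytes into the region thread 1 unmapped.
static_assert(in_region(kFaultPc, kUnmapBase, kUnmapBytes),
              "faulting pc is inside the released region");
```

This is consistent with the <bad address 0xdb4002f1> result from dbx: the stub's page no longer exists once thread 1 has released the 32 MB region.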
Appendix 1: Another run with Thread 1 setting up the atomic_increment_function code:
current thread: t@1
=>[1] os::Solaris::atomic_increment_bootstrap(inc = 1, loc = 0xdff4da9c), line 2627 in "os_solaris.cpp"
[2] atomic::increment(dest = 0xdff4da9c), line 21 in "atomic_solaris.inline.hpp"
[3] GC_locker::lock(), line 18 in "gcLocker.inline.hpp"
[4] universe_init(), line 400 in "universe.cpp"
[5] init_globals(), line 131 in "init.cpp"
[6] Threads::create_vm(args = 0x8046c94), line 2079 in "thread.cpp"
[7] JNI_CreateJavaVM(vm = 0x8046d04, penv = 0x8046d00, args = 0x8046c94), line 2108 in "jni.cpp"
[8] InitializeJVM(pvm = 0x8046d04, penv = 0x8046d00, ifn = 0x8046cdc), line 465 in "java.c"
[9] main(argc = 2, argv = 0x8046d44), line 173 in "java.c"
original atomic_increment_entry code:
(dbx) x 0xdb4002f1/20 i
0xdb4002f1: pushl %edx
0xdb4002f2: pushl %ecx
0xdb4002f3: movl 12(%esp,1),%eax
0xdb4002f7: movl 16(%esp,1),%edx
0xdb4002fb: movl %eax,%ecx
0xdb4002fd: bad opcode
0xdb400300: addb (%ebx),%al
0xdb400302: rcrl $0xc3,90(%ecx)
0xdb400306: pushl %ebp
0xdb400307: movl %esp,%ebp
0xdb400309: movl 0(%ebp),%eax
0xdb40030c: movl (%eax),%eax
0xdb40030e: popl %ebp
0xdb40030f: ret
Appendix 2: Another run, with Thread 1 caught earlier in the
Threads::destroy_vm() request, which I presume triggered all of this exiting.
(dbx) where t@1
current thread: t@1
[1] __lwp_sema_wait(0x804cdd0), at 0xdf6c6479
[2] _park(), at 0xdf7bac55
[3] _swtch(), at 0xdf7baa82
[4] _cond_wait_cancel(0x8060730, 0x8060718), at 0xdf7b9aa5
[5] os::Solaris::cond_wait(cv = 0x8060730, mx = 0x8060718), line 178 in "os_solaris.hpp"
[6] os::Solaris::Event::wait(this = 0x8060710), line 303 in "os_solaris.hpp"
[7] Monitor::wait(this = 0x80606a8, no_safepoint_check = 0, timeout = 0), line 173 in "mutex_solaris.cpp"
[8] VMThread::execute(op = 0x8046c30), line 396 in "vmThread.cpp"
[9] Threads::destroy_vm(), line 2334 in "thread.cpp"
[10] jni_DestroyJavaVM(vm = 0xdff55c98), line 2154 in "jni.cpp"
[11] main(argc = 2, argv = 0x8046d44), line 282 in "java.c"
duplicates: JDK-4412040 one of eight VMs working in parallel crashes (Closed)
relates to: JDK-4422213 java_g -version throws Segmentation Fault in Linux (Closed)