Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2047026 | 1.4.1 | Sundararajan Athijegannathan | P4 | Closed | Fixed | beta |
JDK-2047025 | 1.4.0_02 | Sundararajan Athijegannathan | P4 | Closed | Fixed | 02 |
============================= Problem ===========================
Customer's description:
Java Virtual Machine 1.3.1-b24 running on Sun Solaris 8 on an Enterprise 10000 machine with 16 CPUs.
We have JNI involved in our application; we use CORBA as the communication protocol and a Sybase JDBC driver.
We have been using export LD_LIBRARY_PATH=/usr/lib/lwp on all systems for a long time.
Yes, we have mixed classes (1.2.2_05a and 1.3.1).
The application uses 1800 MB of memory and around 1600 threads.
In our production environment our application server crashed. A core file and a pstack file are available. After investigating these files we did not find any information about why the server crashed or what caused it. We also got an error log on the console, generated by the Virtual Machine. This error is our starting point for the moment, and we hope that Sun can support us in tracking down the problem and give us some information on this error ID.
This is the error from the console.
#
# HotSpot Virtual Machine Error, Internal Error
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Error ID: 455843455054494F4E530E43505000CD 01
#
# Problematic Thread: prio=5 tid=0x1c9b7d8 nid=0x53c runnable
#
======================== dbx stacktrace =======================
Here is a dbx stacktrace:
Reading java
core file header read successfully
Reading ld.so.1
Reading libthread.so.1
Reading libdl.so.1
Reading libc.so.1
Reading libc_psr.so.1
Reading libjvm.so
Reading libCrun.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading libm.so.1
Reading libw.so.1
Reading libmp.so.2
Reading libhpi.so
Reading libverify.so
Reading libjava.so
Reading libzip.so
Reading en_US.UTF-8.so.2
Reading methods_en_US.UTF-8.so.2
Reading libUtility.so
Reading libpthread.so.1
Reading librt.so.1
Reading libaio.so.1
Reading libnet.so
Reading nss_files.so.1
Reading libioser12.so
Reading libGEDWrapper0.so
Reading libtls7d.so
Reading libmth7d.so
Reading libbla7d.so
Reading libnagc.so.6
Reading libintl.so.1
Reading libucb.so.1
Reading libresolv.so.2
Reading libelf.so.1
Reading libGEDWrapper9.so
Reading libGEDWrapper3.so
Reading libGEDWrapper5.so
Reading libGEDWrapper6.so
Reading libGEDWrapper2.so
Reading libGEDWrapper1.so
Reading libGEDWrapper10.so
Reading libGEDWrapper7.so
Reading libGEDWrapper4.so
Reading libGEDWrapper8.so
Reading libRTWrapperDLL.so
Reading libRTContribWrapperDLL.so
detected a multithreaded program
t@1340 (l@1340) terminated by signal ABRT (Abort)
0xff3196f8: __lwp_kill+0x0008: bgeu,a __lwp_kill+0x1c
(/opt/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx) where
current thread: t@1340
=>[1] __lwp_kill(0x0, 0x53c, 0x0, 0xff336000, 0x19798, 0xff2c9b00), at 0xff3196f8
[2] raise(0x6, 0x0, 0x0, 0xffffffff, 0xff33a394, 0xc), at 0xff2c9b08
[3] abort(0xff336000, 0x5f57e648, 0x0, 0x4, 0x0, 0x5f57e669), at 0xff2b5124
[4] os::abort(0x1, 0xff092000, 0x1, 0x5f57e, 0xff092000, 0x5f57e664), at 0xfefa019c
[5] report_error(0xe4, 0x5f57eee4, 0xcd, 0xff028208, 0xff0ffddc, 0xff092000), at 0xfef0f904
[6] report_fatal(0xcd, 0xff092000, 0xff02874c, 0x5f57f, 0xff092000, 0x5f57f824), at 0xfef0f1d4
[7] ExceptionMark::ExceptionMark(0x886e2708, 0x5f57f8f8, 0xff092000, 0xf74000c0, 0xff092000, 0x5f57f884), at 0xfecf2540
[8] constantPoolOopDesc::klass_at_if_loaded(0x0, 0x0, 0x5f57f98c, 0xf740f378, 0xff092000, 0x5f57f914), at 0xfecf0f84
[9] methodOopDesc::fast_exception_handler_bci_for(0x1c9b7d8, 0xf7e13070, 0xf7e13538, 0x5f57fa68, 0x7b, 0x5f57f99c), at 0xfed5dbb8
[10] InterpreterRuntime::exception_handler_for_exception(0x1c9b7d8, 0xf74373f8, 0x5f57fc48, 0xff092000, 0x1c9b7d8, 0x109a0), at 0xfed5d670
[11] 0x123860(0x5f57fbdc, 0x1, 0xff09fa58, 0x12d944, 0x8, 0x5f57fae8), at 0x12385f
[12] 0xff0f9968(0x5f57fc68, 0x5f57fea0, 0xa, 0xf7e13598, 0x4, 0x5f57fb80), at 0xff0f9967
[13] JavaCalls::call_helper(0x5f57fe98, 0xff092000, 0x5f57fde4, 0x1c9b7d8, 0x123d78, 0x5f57fea0), at 0xfecc67ec
[14] JavaCalls::call_virtual(0xf7e136d8, 0x5f57fdd0, 0x5f57fdd4, 0xff092000, 0x5f57fe98, 0x5f57fde4), at 0xfedf0e18
[15] JavaCalls::call_virtual(0x5f57fe98, 0x5f57fe94, 0x5f57fe90, 0x5f57fe84, 0x5f57fe7c, 0x1c9b7d8), at 0xfedf6dec
[16] thread_entry(0xf7417e18, 0x1c9b7d8, 0xff092000, 0x5f57ffa0, 0x1e, 0xe), at 0xfee14ba4
[17] JavaThread::run(0x5f500000, 0xff09cf3c, 0xff092000, 0x80000, 0x1c9b7d8, 0x80000), at 0xfee0f6a4
[18] _start(0xff092000, 0x5f580000, 0x0, 0x0, 0x0, 0x0), at 0xfee0d410
a pstack and pmap are also available:
http://cores.germany/cgi/content.pl?file=/cores/CA_36376145/crash-1/pstack
http://cores.germany/cgi/content.pl?file=/cores/CA_36376145/crash-1/pmap.prod
There is also dbx output, pstack, pmap, env and email from a second crash:
http://cores.germany/cgi/content.pl?file=/cores/CA_36376145/crash-2/dbx-stack-trace-2
http://cores.germany/cgi/content.pl?file=/cores/CA_36376145/crash-2/pstack-2
http://cores.germany/cgi/content.pl?file=/cores/CA_36376145/crash-2/pmap-2
http://cores.germany/cgi/content.pl?file=/cores/CA_36376145/crash-2/environment
http://cores.germany/cgi/content.pl?file=/cores/CA_36376145/crash-2/email
to see the complete explorer data see:
http://cores.germany/cgi/view.pl
and enter case id 36376145
============================ jvm Options ===========================
JAVA_OPTIONS=-server -verbose:gc -Xnoclassgc -Xms1800m -Xmx1800m
============================= Patches ========================
patches for 1.3.1 on Solaris 8:
Patch | Required | Installed |
---|---|---|
Xsun Patch | 108652-33 | - |
dtwm Patch | 108921-12 | 108921-07 |
motif Patch | 108940-24 | 108940-12 |
- I don't think they need these patches, as they use the -server option.
Or is X11 still used with -server?
============================= Analysis ==========================
an analysis from my colleague Kevin Walls:
Hi Peter,
The hex string in the Error ID is ASCII, and it decodes to EXCEPTIONS.CPP
line 205. That line says:
205 fatal("ExceptionMark constructor expects no pending exceptions");
..which suggests to me that it's handling an exception, and sees that one is
already pending.
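The decoding described above can be sketched in Python. This is an illustration only: the layout used here (ASCII file name, the byte 0x0E standing in for the dot, and the final two bytes as a big-endian line number) is inferred from the Error ID strings quoted in this report, not taken from HotSpot documentation, and decode_error_id is a hypothetical helper name.

```python
from typing import Tuple


def decode_error_id(error_id: str) -> Tuple[str, int]:
    """Decode a HotSpot 1.3.x style Error ID into (file name, line number).

    Assumed layout, inferred from the IDs quoted in this report:
      - leading bytes: ASCII file name, with 0x0E in place of '.'
      - last two bytes: source line number, big-endian
      - a trailing " 01" style suffix after the space is ignored
    """
    hex_part = error_id.split()[0]      # drop the trailing " 01"
    raw = bytes.fromhex(hex_part)
    name_bytes, line_bytes = raw[:-2], raw[-2:]
    # Map the 0x0E placeholder back to '.', keep other bytes as ASCII.
    name = "".join("." if b == 0x0E else chr(b) for b in name_bytes)
    line = int.from_bytes(line_bytes, "big")
    return name, line


if __name__ == "__main__":
    print(decode_error_id("455843455054494F4E530E43505000CD 01"))
```

Applied to the Error ID from the console log, this yields the file name EXCEPTIONS.CPP and line 205, matching the analysis above.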
In the pstack of lwp 1340 which aborts:
(incidentally - are they using the alternate thread library in lib/lwp? That
would be my guess as why lwp/thread numbers all seem to match! Not sure if
it's relevant or if they should be using it - could be a useful comparison
good/bad?)
.
.
.
fed5d670 __1cSInterpreterRuntimebFexception_handler_for_exception6FpnKJavaThread_pnHoopDesc__pC_ (1c9b7d8, f74373f8, 5f57fc48, ff092000, 1c9b7d8, 109a0) + 338
00123860 ???????? (5f57fbdc, 1, ff09fa58, 12d944, 8, 5f57fae8)
ff0f9968 __1cMStubRoutinesG_code1_ (5f57fc68, 5f57fea0, a, f7e13598, 4, 5f57fb80) + 3e8
fecc67ec __1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (5f57fe98, ff092000, 5f57fde4, 1c9b7d8, 123d78, 5f57fea0) + 308
.
.
.
The middle line here ??????? - this is probably in the program text so I guess
it's their native app? pmap would prove this.
If it is their program, it is then calling into the following (obtained by
running /opt/SUNWspro/bin/dem on the symbol in the pstack):
unsigned char*InterpreterRuntime::exception_handler_for_exception(JavaThread*,oopDesc*)
..which seems to be what works out the continuation address when an exception
happens (but from above, one was already pending I suppose, hence the abort).
If that was true it could be their own fault: however, I think the call_helper
routine is all java land execution stuff, so the ????? address may be on the
heap (again, pstack will prove it - pstack on the core should be enough to show
if it's the heap?).
So the ?????? may be compiled code:
java -Xint would run in interpreted, no-hotspot mode. I hope the problem is
reproducible and they have a test environment!
Ah - from:
http://cheesypoof.uk/lxr/source/hotspot1.3.1/src/share/vm/runtime/stubRoutines.hpp?p=Java_1.3.1
// StubRoutines provides entry points to assembly routines used by
// compiled code and the run-time system. Platform-specific entry
// points are defined in the platform-specific inner class.
//
..so it IS compiled code!
Maybe we've got a new problem statement:
compiled code is generating an exception while one is already pending
(with it to be confirmed if non-compiled code has the same problem)
I'm really not sure if that's any help at all, but it was interesting!
Kevin Walls
The address 00123860 (hex) is within the heap. This is the relevant line in pmap:
00026000 39832K read/write/exec [ heap ]
So it is dynamically compiled code, as we thought, i.e. HotSpot-compiled
code.
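The range check behind this conclusion can be written out as a short Python sketch. The base address and size are copied from the pmap line above; address_in_mapping is a hypothetical helper name, and the size is assumed to be in units of 1024 bytes, as the K suffix suggests.

```python
def address_in_mapping(addr: int, base: int, size_kb: int) -> bool:
    """Return True if addr lies in the mapping [base, base + size_kb * 1024)."""
    end = base + size_kb * 1024   # exclusive upper bound of the mapping
    return base <= addr < end


if __name__ == "__main__":
    heap_base = 0x00026000        # start of the [ heap ] mapping in pmap
    heap_size_kb = 39832          # "39832K" from the same pmap line
    frame_pc = 0x00123860         # the unresolved ???????? frame in pstack
    print(address_in_mapping(frame_pc, heap_base, heap_size_kb))
```

With these values the mapping spans 0x00026000 up to (but not including) 0x0270C000, so the frame address 0x00123860 falls inside it.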
=========================================================
I hope that this data is enough for you to find out what happened.
If you need more information, please ask me.
- backported by
  - JDK-2047025 java vm 1.3.1-b24 crashes (Closed)
  - JDK-2047026 java vm 1.3.1-b24 crashes (Closed)
- relates to
  - JDK-4530217 java vm crashes in garbage collection (Closed)
  - JDK-4622693 HotSpot VM Error 11; Error ID: 4F530E43505002C4 01; jdk1.3.1_01 client mixed (Closed)
  - JDK-4854693 [1.3.1_06] JVM crashes in share/vm/utilities/exceptions.cpp, [mantis rc] hang (Closed)