Bug
Resolution: Cannot Reproduce
P4
None
1.3.1_03
sparc
solaris_8
category : java
release : 1.3.1
subcategory : native_interface
type : bug
synopsis : C++ program using JNI to make RMI calls to server encounters JVM issue
description : FULL PRODUCT VERSION :
java version "1.3.1_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_02-b02)
Java HotSpot(TM) Client VM (build 1.3.1_02-b02, mixed mode)
java -version
java version "1.3.1_03"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_03-b03)
Java HotSpot(TM) Client VM (build 1.3.1_03-b03, mixed mode)
FULL OPERATING SYSTEM VERSION :
SunOS devblade 5.8 Generic_108528-13 sun4u sparc SUNW,Sun-Blade-100
ADDITIONAL OPERATING SYSTEMS :
HP-UX 11.00 (using HP's 1.3.1_02 JRE)
EXTRA RELEVANT SYSTEM CONFIGURATION :
- This problem also occurs with J2RE 1.4.0 on Solaris 8
(applied patch cluster on Sun website circa 15.Feb.2002)
- This problem also occurs if using the "alternate"
libthread library for Solaris 8 in /usr/lwp/ (the 1-to-1
library to be included in Solaris 9 by default).
- This problem does NOT appear to occur on Win32.
A DESCRIPTION OF THE PROBLEM :
We have a multi-threaded native (C++) application which
must communicate with a Java server. The interface to the
Java server is an API which uses RMI for communication. We
rely heavily on the ability to perform JNI calls to access
this Java API; those calls then immediately dispatch RMI
remote methods to the server.
We are seeing that under heavy load, even when only one
thread is running in the application, repeated native calls
into Java via JNI, which in turn invoke the corresponding
RMI procedures, cause the JVM to abort processing for that
thread partway through the JNI->RMI call. Outside of the
native debugger (dbx), we see either a hang of that
particular thread or, worse, given enough time, a crash of
the whole application with a JVM stack trace pointing to
the RMI call that raised the signal. The point of failure
is consistently in some RMI method that was invoked by a
JNI call. What is not consistent is the precise method in
which the failure occurs once the RMI remote method call
has started, except that it is often a native method that
is part of the JVM implementation (e.g.,
java.lang.Class.getName).
Running the program under dbx clearly shows that the
trouble always begins with some JNI call such
as "CallObjectMethod". The most information I can get
about a given crash is when starting the program from one
shell, then attaching to the process with dbx in another
shell, then typing "cont". After several minutes, we see:
------------
t@1 (l@1) signal USR1 (User Signal 1) in (unknown) at 0xfb0091fc
0xfb0091fc: st %g0, [%sp + %g3]
Current function is JNIEnv_::CallObjectMethod
865 result = functions->CallObjectMethodV(this,obj,methodID,args);
-----------
- This problem seems to occur whenever the (native) RMI
client is on the same host as the RMI server, but not when
the RMI server is on a different host.
- We thought this could have something to do with the RMI
distributed garbage collector, but we are not sure about that.
- We assume this has not been addressed in J2RE 1.4.0 as we
tried that for Solaris Sparc and observed the same behavior.
- We think it has to do with JNI->RMI as we replaced the
RMI with straight socket functions and it didn't happen
there, and it doesn't happen with a pure Java RMI client.
** This is possibly the same as Bug ID 4493927, but it is
hard to tell due to the lack of detailed information in
that report.
The sources below are enough to reproduce the problem.
Give it a few minutes when running under dbx. Note that
some of the code that manipulates the JNI local/global
references may seem strange; this is because I have
attempted to mimic what happens in our real application
over a length of time. Of course, we don't sit in an
infinite loop in our actual application; there, having
several threads running at once and hitting these types of
JNI->RMI calls regularly causes the bug to occur.
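The attached test case itself is not reproduced in this report. As a rough illustration only, the Java side of the call chain described above might look like the following sketch. The method names getContainer and getNumberContainer are taken from the stack trace further down; everything else (the class RmiLoopSketch, the NumberContainer payload, the server logic, the use of an anonymous port) is an assumption made for the example. The loop here is driven from Java rather than from native code through CallObjectMethod, so this is the "pure Java RMI client" case that the reporter says does not fail; it only shows which calls the native loop would be hitting repeatedly, with client and server on the same host.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class RmiLoopSketch {

    /** Hypothetical payload; the real container class is not shown in the report. */
    public static class NumberContainer implements Serializable {
        private static final long serialVersionUID = 1L;
        public final int value;
        public NumberContainer(int value) { this.value = value; }
    }

    /** Remote interface; getNumberContainer is the name seen in the stack trace. */
    public interface TestServer extends Remote {
        NumberContainer getNumberContainer(int seed) throws RemoteException;
    }

    /** Hypothetical server logic: returns seed + 1 wrapped in a container. */
    public static class TestServerImpl implements TestServer {
        @Override
        public NumberContainer getNumberContainer(int seed) {
            return new NumberContainer(seed + 1);
        }
    }

    /** Client wrapper; in the report, native code invokes getContainer through JNI. */
    public static class TestClient {
        private final TestServer server;
        public TestClient(TestServer server) { this.server = server; }
        public NumberContainer getContainer(int seed) throws RemoteException {
            return server.getNumberContainer(seed); // dispatches the RMI call
        }
    }

    /** Exports the server on an anonymous local port and calls it repeatedly via its stub. */
    public static long runLoop(int iterations) throws Exception {
        TestServerImpl impl = new TestServerImpl();
        TestServer stub = (TestServer) UnicastRemoteObject.exportObject(impl, 0);
        try {
            TestClient client = new TestClient(stub);
            long sum = 0;
            for (int i = 0; i < iterations; i++) {
                sum += client.getContainer(i).value;
            }
            return sum;
        } finally {
            // Unexport so the RMI runtime threads can shut down.
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sum = " + runLoop(1000));
    }
}
```

In the scenario reported here, the body of the loop in runLoop would instead live in the C++ program, which would obtain a reference to the client object and call getContainer through jenv->CallObjectMethod on each iteration.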
---------------
java -version
java version "1.3.1_03"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_03-b03)
Java HotSpot(TM) Client VM (build 1.3.1_03-b03, mixed mode)
JNI method called
Returned from jenv->CallObjectMethod
JNI method called
Returned from jenv->CallObjectMethod
An unexpected exception has been detected in native code outside the VM.
Unexpected Signal : 11 occurred at PC=0x1a75b8
Function name=(N/A)
Library=(N/A)
NOTE: We are unable to locate the function name symbol for the error
just occurred. Please refer to release documentation for possible
reason and solutions.
Current Java thread:
at java.lang.System.currentTimeMillis(Native Method)
at sun.rmi.transport.tcp.TCPConnection.isDead(TCPConnection.java:155)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:155)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:78)
at com.on.RMIJNITest.RMIJNITestServer_Stub.getNumberContainer(Unknown Source)
at com.on.RMIJNITest.RMIJNITestClient.getContainer(RMIJNITestClient.java:36)
Dynamic libraries:
0x10000 ./jrmitest_main2
0xff000000 /net/koori/onestop/jdk/1.3.1_03/latest/binaries/solsparc/jre/lib/sparc/libjvm.so
0xff370000 /usr/lib/libpthread.so.1
0xfef00000 /usr/lib/libnsl.so.1
0xfefe0000 /usr/lib/libsocket.so.1
0xfefb0000 /usr/dist/share/forte_dev/SUNWspro/lib/libCrun.so.1
0xff390000 /usr/lib/libdl.so.1
0xfec80000 /usr/dist/share/forte_dev/SUNWspro/lib/libCstd.so.1
0xfeec0000 /usr/lib/libm.so.1
0xfeef0000 /usr/lib/libw.so.1
0xfec40000 /usr/lib/libthread.so.1
0xfeb00000 /usr/lib/libc.so.1
0xfee80000 /usr/lib/libmp.so.2
0xfeea0000 /usr/platform/SUNW,Sun-Blade-1000/lib/libc_psr.so.1
0xfebf0000 /usr/dist/share/forte_dev,v6.2/SUNWspro/WS6U2/usr/lib/cpu/sparcv8plus/libCstd.so.1
0xfeac0000 /net/koori.sfbay/a/v01/jdk/1.3.1_03/fcs/binaries/solsparc/jre/lib/sparc/native_threads/libhpi.so
0xfea90000 /net/koori.sfbay/a/v01/jdk/1.3.1_03/fcs/binaries/solsparc/jre/lib/sparc/libverify.so
0xfea40000 /net/koori.sfbay/a/v01/jdk/1.3.1_03/fcs/binaries/solsparc/jre/lib/sparc/libjava.so
0xfea10000 /net/koori.sfbay/a/v01/jdk/1.3.1_03/fcs/binaries/solsparc/jre/lib/sparc/libzip.so
0xfe890000 /net/koori.sfbay/a/v01/jdk/1.3.1_03/fcs/binaries/solsparc/jre/lib/sparc/libnet.so
0xfe870000 /usr/lib/nss_nis.so.1
Local Time = Tue Apr 2 12:23:41 2002
Elapsed Time = 517
#
# The exception above was detected in native code outside the VM
#
# Java VM: Java HotSpot(TM) Client VM (1.3.1_03-b03 mixed mode)
#
# An error report file has been saved as hs_err_pid1548.log.
# Please refer to the file for further information.
#
Abort
detected a multithreaded program
Attached to process 1548 with 9 LWPs
t@0 (l@1) stopped in ___lwp_cond_wait at 0xfeb9c07c
0xfeb9c07c: ___lwp_cond_wait+0x0004: ta 0x8
(/usr/dist/share/forte_dev/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx) cont
t@1 (l@6) signal USR1 (User Signal 1) in (unknown) at 0xfac38df0
0xfac38df0: sethi %hi(0xffffe000), %g3
Current function is JNIEnv_::CallObjectMethod
860 result = functions->CallObjectMethodV(this,obj,methodID,args);
(/usr/dist/share/forte_dev/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx) where
current thread: t@1
[1] 0xfac38df0(0xf4f17960, 0xf4f17970, 0xf4f1c808, 0x0, 0x27470, 0xffbef4fc), at 0xfac38def
[2] 0xfac3c22c(0xf4f17970, 0x2ae430, 0x237c8, 0xff316000, 0x27470, 0xffbef55c), at 0xfac3c22b
[3] 0xfac3bfa0(0xf4f178e8, 0x1, 0xff322918, 0xffbefb00, 0x1e, 0xe), at 0xfac3bf9f
[4] 0xff34b454(0xffbef770, 0xffbef970, 0xa, 0xf88bc630, 0x7b0d0, 0xffbef8a4), at 0xff34b453
[5] JavaCalls::call_helper(0xffbef968, 0xff316000, 0xffbef89c, 0x27470, 0x7b0d0, 0xffbef970), at 0xff101a54
[6] JavaCalls::call(0xffbef968, 0xffbef87c, 0xffbef89c, 0x27470, 0xffbef884, 0xffbef814), at 0xff1016e4
[7] jni_invoke(0x1, 0x27470, 0xe58ec, 0x1, 0x1405f0, 0xffbef94c), at 0xff1156c0
[8] jni_CallObjectMethodV(0xff316000, 0x27470, 0xe58ec, 0xffbefa40, 0x1405f0, 0x274fc), at 0xff194868
=>[9] JNIEnv_::CallObjectMethod(this = 0x274fc, obj = 0xe58ec, methodID = 0x1405f0, ...), line 860 in "jni.h"
[10] jni_to_rmi_loop(), line 63 in "jrmitest_main2.cpp"
[11] main(), line 21 in "jrmitest_main2.cpp"
(/usr/dist/share/forte_dev/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx) quit
detaching from process 1548
---------------
Please find testcase.jar attached to the bug report.
It fails with both the Client and Server VM, and also when using -Xint.
Relates to: JDK-4493927 RMI client using JNI encounters internal HotSpot error (Closed)