Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-6414270

Issue of a critical JDK/JNI related issue in customer product

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Not an Issue
    • Icon: P2 P2
    • None
    • 1.4.2_08
    • core-libs
    • None

      ******************** JBOSS FAILURE ********************
      JBoss not ready, retrying
      javax.naming.CommunicationException [Root exception is java.rmi.ConnectException: Connection refused to host: 47.153.163.11; nested exception is:
              java.net.ConnectException: Connection refused]
              at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:647)
              at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:507)
              at javax.naming.InitialContext.lookup(InitialContext.java:347)
              at fengdong.test.NamingTest.getRmiConnector(NamingTest.java:118)
              at fengdong.test.NamingTest$JBossMonitor.getConnection(NamingTest.java:221)
              at fengdong.test.NamingTest$JBossMonitor.handleJBossFailure(NamingTest.java:243)
              at fengdong.test.NamingTest$JBossMonitor.checkJBoss(NamingTest.java:210)
              at fengdong.test.NamingTest$JBossMonitor.run(NamingTest.java:188)
      Caused by: java.rmi.ConnectException: Connection refused to host: 47.153.163.11; nested exception is:
              java.net.ConnectException: Connection refused
              at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:567)
              at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:185)
              at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:171)
              at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:101)
              at org.jnp.server.NamingServer_Stub.lookup(Unknown Source)
              at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:528)
              ... 7 more
      Caused by: java.net.ConnectException: Connection refused
              at java.net.PlainSocketImpl.socketConnect(Native Method)
              at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:305)
              at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:171)
              at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:158)
              at java.net.Socket.connect(Socket.java:452)
              at java.net.Socket.connect(Socket.java:402)
              at java.net.Socket.<init>(Socket.java:309)
              at java.net.Socket.<init>(Socket.java:124)
              at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22)
              at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128)
              at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:562)
              ... 12 more
      Enter push() 2. <------------- C++ native thread is blocking here because it is trying to obtain a lock.
      JBoss not ready, retrying
      javax.naming.CommunicationException: Receive timed out [Root exception is java.net.SocketTimeoutException: Receive timed out]
              at org.jnp.interfaces.NamingContext.discoverServer(NamingContext.java:1115)
              at org.jnp.interfaces.NamingContext.checkRef(NamingContext.java:1192)
              at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:514)
              at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:507)
              at javax.naming.InitialContext.lookup(InitialContext.java:347)
              at fengdong.test.NamingTest.getRmiConnector(NamingTest.java:118)
              at fengdong.test.NamingTest$JBossMonitor.getConnection(NamingTest.java:221)
              at fengdong.test.NamingTest$JBossMonitor.handleJBossFailure(NamingTest.java:243)
              at fengdong.test.NamingTest$JBossMonitor.checkJBoss(NamingTest.java:210)
              at fengdong.test.NamingTest$JBossMonitor.run(NamingTest.java:188)
      Caused by: java.net.SocketTimeoutException: Receive timed out
              at java.net.PlainDatagramSocketImpl.receive(Native Method)
              at java.net.DatagramSocket.receive(DatagramSocket.java:711)
              at org.jnp.interfaces.NamingContext.discoverServer(NamingContext.java:1093)
              ... 9 more
      RMI Connection Service not ready, retrying <----------------- JBoss started here. So the communication exception disappear after this point.
      javax.naming.NameNotFoundException: jmx not bound
              at org.jnp.server.NamingServer.getBinding(NamingServer.java:495)
              at org.jnp.server.NamingServer.getBinding(NamingServer.java:503)
              at org.jnp.server.NamingServer.getObject(NamingServer.java:509)
              at org.jnp.server.NamingServer.lookup(NamingServer.java:253)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              at java.lang.reflect.Method.invoke(Method.java:324)
              at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:261)
              at sun.rmi.transport.Transport$1.run(Transport.java:148)
              at java.security.AccessController.doPrivileged(Native Method)
              at sun.rmi.transport.Transport.serviceCall(Transport.java:144)
              at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:460)
              at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:701)
              at java.lang.Thread.run(Thread.java:534)
              at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:247)
              at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:223)
              at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:133)
              at org.jnp.server.NamingServer_Stub.lookup(Unknown Source)
              at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:528)
              at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:507)
              at javax.naming.InitialContext.lookup(InitialContext.java:347)
              at fengdong.test.NamingTest.getRmiConnector(NamingTest.java:118)
              at fengdong.test.NamingTest$JBossMonitor.getConnection(NamingTest.java:221)
              at fengdong.test.NamingTest$JBossMonitor.handleJBossFailure(NamingTest.java:243)
              at fengdong.test.NamingTest$JBossMonitor.checkJBoss(NamingTest.java:210)
              at fengdong.test.NamingTest$JBossMonitor.run(NamingTest.java:188)
      Obtain RMI Connection succeed.
      JBossMonitor Check JBoss Complete. <---------------- JBoss fully recovered.
      Check JBoss by MBean: push
      Check JBoss by MBean Complete: push. Reconnect: true
      Recovery is call initMb(): push
      javax.naming.NoInitialContextException: Cannot instantiate class: org.jnp.interfaces.NamingContextFactory [Root exception is java.lang.ClassNotFoundException: org/jnp/interfaces/NamingContextFactory] <---------------- Unexpected exception appear in C++ thread.
              at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:652)
              at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:243)
              at javax.naming.InitialContext.init(InitialContext.java:219)
              at javax.naming.InitialContext.<init>(InitialContext.java:195)
              at fengdong.test.NamingTest.getRmiConnector(NamingTest.java:115)
              at fengdong.test.NamingTest.initTest(NamingTest.java:74)
              at fengdong.test.NamingTest$JBossMonitor.checkJBoss(NamingTest.java:206)
              at fengdong.test.NamingTest.push(NamingTest.java:99)
      Caused by: java.lang.ClassNotFoundException: org/jnp/interfaces/NamingContextFactory
              at java.lang.Class.forName0(Native Method)
              at java.lang.Class.forName(Class.java:219)
              at com.sun.naming.internal.VersionHelper12.loadClass(VersionHelper12.java:42)
              at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:649)
              ... 7 more
      javax.naming.NoInitialContextException: Cannot instantiate class: org.jnp.interfaces.NamingContextFactory [Root exception is java.lang.ClassNotFoundException: org/jnp/interfaces/NamingContextFactory]
              at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:652)
              at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:243)
              at javax.naming.InitialContext.init(InitialContext.java:219)
              at javax.naming.InitialContext.<init>(InitialContext.java:195)
              at fengdong.test.NamingTest.getRmiConnector(NamingTest.java:115)
              at fengdong.test.NamingTest.initTest(NamingTest.java:74)
              at fengdong.test.NamingTest$JBossMonitor.checkJBoss(NamingTest.java:206)
              at fengdong.test.NamingTest.push(NamingTest.java:99)
      Caused by: java.lang.ClassNotFoundException: org/jnp/interfaces/NamingContextFactory
              at java.lang.Class.forName0(Native Method)
              at java.lang.Class.forName(Class.java:219)
              at com.sun.naming.internal.VersionHelper12.loadClass(VersionHelper12.java:42)
              at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:649)
              ... 7 more

      ============ Comments ===========


      ============ TestCase ===========

      Full Blown Test Case is present at :
        /tmp/830445/naming-issue-environment-2.tar.gz on titan.sfbay.sun.com

      Test case has 'make.sh' to build the test program and then launch it using 'naming.sh' script.

      ============ Workaround ===========

      N/A

      ============ Justfication ===========
      The problem of NoInitialContextException is summarized as follows:
      The simplified overall structure of the system where the problem is encountered is depicted as following picture.
      Picture (Metafile)
      The system is used to mediate (forward) message passed from southbound through C++ interface to northbound Java interface. The upper two layers are pure Java and the lowest layer, Device Adapter is a process mixed C++ with JVM through JNI technical. The Device Adapter is running as a C++ process and creates JVM inside itself by JNI Invocation API.
      The problem occurs in the process of Device Adapter and the following classes are highly related to the problem:

      1. Class MessageBroker.
      This is a Java class defined by Nortel design team. Its functionality is to encapsulate all aspects related to JBoss JMS services and provide the designer a Facade interface. In a Device Adapter, the object of this class is created and invoked by C++ code through JNI Invocation API. Particularly, its pushMessage() method is frequently invoked by tens of C++ native threads. Method pushMessage() is a synchronized method of MessageBroker. From within pushMessage(), JMS, JNDI, and Mbean services provided by JBoss are used heavily.

      2. Class JBossHealthMonitor.
      This is a non-static inner class of class MessageBroker. It is a sub-class of java.lang.Thread. Particularly, its checkJBoss() method is used to check the health of JBoss. Method checkJBoss() is invoked in two scenarios. The first scenario is that it is invoked from the thread implemented by JBossHealthMonitor periodically. The second scenario is that it is invoked by MessageBroker.pushMessage() if the access to a JBoss service within pushMessage() results an exception. Method checkJBoss() is a synchronized method of JBossHealthMonitor.

      Method checkJBoss() checks the health of JBoss by the following operations:
      1. It tries to obtain the remote reference to an RMI object provided by JBoss.
      2. When step 1 has succeeded, it tries to perform a light-weight operation provided by that RMI object, which checks the server information of JBoss.

      If both steps succeed, checkJBoss() will return. If either of them fails, checkJBoss() will sleep for a while and try the same operation until it succeeds to perform all steps. That means checkJBoss() will not return until it thinks JBoss is ready.


      In the case of JBoss failure, either the thread of Class JBossHealthMonitor or one of the native thread invoking MessageBroker.pushMessage() will detect the failure firstly. Until JBoss recovers, the thread detecting the failure will block within checkJBoss() because of the retrying loop, and all other threads invoking checkJBoss() will block when they are trying to obtain the lock of JBossHealthMonitor before entering the method.

      We find a NoInitialContextException is sometimes, but not always, thrown from checkJBoss() if all the following conditions are met:

      1. It is the thread of Class JBossHealthMonitor detects the failure firstly.
      2. Before JBoss recovering, at least one native thread has invoked checkJBoss() and therefore blocked before entering the method.

      Then, after JBoss recovers, the checkJBoss() in the thread of Class JBossHealthMonitor returns normally. And the blocked native thread enter method checkJBoss(). And a invocation to "new InitialContext()" throws a NoInitialContextException as follows:

      javax.naming.NoInitialContextException: Cannot instantiate class: org.jnp.interfaces.NamingContextFactory [Root exception is java.lang.ClassNotFoundException: org/jnp/interfaces/NamingContextFactory]

              at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:652)
              at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:243)
              at javax.naming.InitialContext.init(InitialContext.java:219)
              at javax.naming.InitialContext.<init>(InitialContext.java:175)
              at com.nortel.is.mx.util.ADKMxUtil.getRMIConnector(Unknown Source)
              at com.nortel.is.session.RMISessionBroker.connect(Unknown Source)
              at com.nortel.is.session.RMISessionBroker.resetConnections(Unknown Source)
              at com.nortel.is.messaging.MessageBroker.initMb(MessageBroker.java:325)
              at com.nortel.is.messaging.MessageBroker.access$400(MessageBroker.java:71)
              at com.nortel.is.messaging.MessageBroker$JBossHealthMonitor.checkJBoss(MessageBroker.java:1479)
              at com.nortel.is.messaging.MessageBroker.pushMessage(MessageBroker.java:1214)
      Caused by: java.lang.ClassNotFoundException: org/jnp/interfaces/NamingContextFactory
              at java.lang.Class.forName0(Native Method)
              at java.lang.Class.forName(Class.java:219)
              at com.sun.naming.internal.VersionHelper12.loadClass(VersionHelper12.java:42)
              at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:649)
              ... 10 more
             
             
      javax.naming.NoInitialContextException: Cannot instantiate class: org.jnp.interfaces.NamingContextFactory [Root exception is java.lang.ClassNotFoundException: org/jnp/interfaces/NamingContextFactory]

              at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:652)
              at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:243)
              at javax.naming.InitialContext.init(InitialContext.java:219)
              at javax.naming.InitialContext.<init>(InitialContext.java:175)
              at com.nortel.is.mx.util.ADKMxUtil.getRMIConnector(Unknown Source)
              at com.nortel.is.messaging.MessageBroker$JBossHealthMonitor.getConnection(MessageBroker.java:1462)
              at com.nortel.is.messaging.MessageBroker$JBossHealthMonitor.handleJBossFailure(MessageBroker.java:1525)
              at com.nortel.is.messaging.MessageBroker$JBossHealthMonitor.checkJBoss(MessageBroker.java:1513)
              at com.nortel.is.messaging.MessageBroker.pushMessage(MessageBroker.java:1240)
      Caused by: java.lang.ClassNotFoundException: org/jnp/interfaces/NamingContextFactory
              at java.lang.Class.forName0(Native Method)
              at java.lang.Class.forName(Class.java:219)
              at com.sun.naming.internal.VersionHelper12.loadClass(VersionHelper12.java:42)
              at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:649)
              ... 8 more

      We are sure that class org.jnp.interfaces.NamingContextFactory is in the classpath correctly. Actually, we added an invocation as follows immediately before the invocation of "new InitialContext()":

      Class.forName("org.jnp.interfaces.NamingContextFactory");

      And the above code executes quietly without any exception thrown.

      According to our understanding to JDK and JNDI document, a NoInitialContextException should not be thrown as long as the classpath is correct. However, the actual behavior of JDK/JNDI seems violating the document because the classpath is proven correct by an invocation to Class.forName() and the exception is still thrown. More unfortunately, this violation can not be walked-around programmatically because no matter how many times of retries still result the exception samely.

      A suspicious difference between "new InitialContext()" and Class.forName() is they use different class loaders. By reading JDK source code, we discover that the former uses a ContextLoader while the latter uses the caller's loader. Due to the limitation of knowledge to JVM interior, we have so far not known the exact differences, the side-effect of those differences and how to walk around the problem.

      The OS in use is Solaris 9. The JDK version is 1.4.2_08. JBoss version is 3.2.2.
      Since the previous mail was sent, we have further eased the reproducing of the problem and therefore narrowed down the scope of the investigation. Now the problem can be reproduced by a simple test program almost every time. The test program is made up of about 200 hundred of C++/Java code and relies on Rogewave C++ library and JBoss only.
      File NamingTest.cpp and NamingTest.java are C++ and Java source code of the test program. File make.sh is the make file of the test program. File naming.sh is the launch script of the test program.
       

      Following is two examples of logs generated by the test program, for the normal running and the problem reproducing respectively.

      When JBoss is running and the test program is launched, you will see the logs like follows:
      zsups311# ./naming.sh
      Create JVM: 0.
      Attach to Current Thread: 0.
      Find Class: 4c2b4.
      Find Method: 109548, 109568.
      Instance JBossMonitor.
      Obtain RMI Connection succeed.
      Instance JBossMonitor Complete.
      JBossMonitor Thread Start.
      JBossMonitor Check JBoss.
      Check JBoss by MBean: monitor
      Check JBoss by MBean Complete: monitor. Reconnect: false
      JBossMonitor Check JBoss Complete.
      Create Object: 4c2b0.
      Enter thread body.
      Attach to Current Thread: 0.
      Enter push() 2.
      Check JBoss by MBean: push
      Check JBoss by MBean Complete: push. Reconnect: false
      Leave push().
      JBossMonitor Check JBoss.
      Check JBoss by MBean: monitor
      Check JBoss by MBean Complete: monitor. Reconnect: false
      JBossMonitor Check JBoss Complete.
      Enter push() 2.
      Check JBoss by MBean: push
      Check JBoss by MBean Complete: push. Reconnect: false
      Leave push().
      JBossMonitor Check JBoss.
      Check JBoss by MBean: monitor
      Check JBoss by MBean Complete: monitor. Reconnect: false
      JBossMonitor Check JBoss Complete.
       
      Logs in red and blue are both generated by Java code. The red logs are generated in C++ native thread, while the blue ones are generated in a pure Java thread.
      If we shutdown JBoss and then restart it, we will see the logs as follows. Please notice the important state change point and the unexpected problem occurred at last (marked as red).
       
      Enter push() 2.
      Check JBoss by MBean: push
      Check JBoss by MBean Complete: push. Reconnect: false
      Leave push().
      JBossMonitor Check JBoss.
      Check JBoss by MBean: monitor
      Check JBoss by MBean Complete: monitor. Reconnect: false
      JBossMonitor Check JBoss Complete.
      JBossMonitor Check JBoss.
      Check JBoss by MBean: monitor <--------- JBoss shutdown at this point
      java.rmi.ConnectException: Connection refused to host: 47.153.163.11; nested exception is:
              java.net.ConnectException: Connection refused
              at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:567)
              at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:185)
              at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:171)
              at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:101)
              at org.jboss.invocation.jrmp.server.JRMPInvoker_Stub.invoke(Unknown Source)
              at org.jboss.invocation.jrmp.interfaces.JRMPInvokerProxy.invoke(JRMPInvokerProxy.java:135)
              at org.jboss.invocation.InvokerInterceptor.invoke(InvokerInterceptor.java:96)
              at org.jboss.jmx.connector.invoker.client.InvokerAdaptorClientInterceptor.invoke(InvokerAdaptorClientInterceptor.java:58)
              at org.jboss.proxy.SecurityInterceptor.invoke(SecurityInterceptor.java:45)
              at org.jboss.proxy.ClientMethodInterceptor.invoke(ClientMethodInterceptor.java:55)
              at org.jboss.proxy.ClientContainer.invoke(ClientContainer.java:85)
              at $Proxy0.getMBeanInfo(Unknown Source)
              at fengdong.test.NamingTest$JBossMonitor.checkJBoss(NamingTest.java:202)
              at fengdong.test.NamingTest$JBossMonitor.run(NamingTest.java:188)
      Caused by: java.net.ConnectException: Connection refused
              at java.net.PlainSocketImpl.socketConnect(Native Method)
              at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:305)
              at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:171)
              at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:158)
              at java.net.Socket.connect(Socket.java:452)
              at java.net.Socket.connect(Socket.java:402)
              at java.net.Socket.<init>(Socket.java:309)
              at java.net.Socket.<init>(Socket.java:124)
              at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22)
              at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128)
              at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:562)
              ... 13 more

            jhangalsunw Jayalaxmi Hangal (Inactive)
            sdattatrsunw Sreenatha Dattatri (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

              Created:
              Updated:
              Resolved:
              Imported:
              Indexed: