Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-4312285

Java 2: native thread stack corruption on return from JNI call.

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Not an Issue
    • Icon: P1 P1
    • None
    • 1.2.2
    • hotspot
    • unknown
    • solaris_2.6

      Here is the problem description from customer:

      The main application is bascially a native application that is executing
      within a particular stack space. This main application is responsible
      for starting the JVM which now runs in the same process space as the
      main application. Once the JVM is started a user may be able to run a
      static java method (residing in a jar file) from within the main
      application process space. This static method now runs in a separate
      Java thread, which I will refer to as a 'Java UDR thread'.

      The test case can be viewed as a 'Java UDR' thread being invoked by the
      main application in a loop, which is executing about 500k times. The
      execution is without any problems for about 8k iterations (8467 to be
      specific) and then the problem occurs.

      At the point where the problem happens the native application is making
      a JNI call into the Java Main thread as follows :


          /* TEMP Value to test if the value gets written */
          rc = 299;
          mt_logprintf("Before JNI call : Error code rc = %d", rc);
          /* call upon Java to handle the rest */
          IFX_MT_ENTER_JNI_CALL;
          rc = (*JVP_env)->CallStaticIntMethod(JVP_env,
                      jvpinfo->ji_UDRDBAContextImpl_class,
                          jvpinfo->ji_UDRDBAContextImpl_openContext_mid,
                          CPTR_TO_JHANDLE(lminfo),
                      need_to_start_java_thread ? (jobject)NULL :
                      (jobject)p->jt_thread);
          IFX_MT_LEAVE_JNI_CALL;

          mt_logprintf("After JNI call : Error code rc = %d", rc);

          /*if (rc != FUNCSUCC)*/
          if (rc != 199) <-------------- CHECKING FOR THE UNIQUE RETURN
      VALUE
              {
              mt_logprintf("Java UDR context open failed Continue only if rc
      == 0 (debugging)");
              if (rc != FUNCSUCC)
                return sqerr(-9435);
              }

      The macro IFX_MT_ENTER_JNI_CALL ensures that the proper stack top/bottom
      and the sp are saved to allow for execution of the program after the JNI
      call returns. This is necessary since the Java UDR thread has a stack
      space which is distinct from the main process (main java thread) stack.
      Also note that we are checking for unique return values to determine if
      the function returned succesfully or not.

      A snippet of the code which is executed as a result of the JNI call is
      given below :

                      /* We can re-use an existing java udr thread */
                     {
                      if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",

                              informix.jvp.Traceable.TRACE_COARSE,
                              "RE-USING dbaThread : "+ udrDbaThread);
                      if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",

                              informix.jvp.Traceable.TRACE_COARSE,
                              "RE-USING dbaMonitor : "+
      udrDbaThread.dbaMonitor);
                      dbaThread = udrDbaThread;
                      dbaMonitor = (DBAMonitorImpl)udrDbaThread.dbaMonitor;
                      }

                        /* Create a new java UDR context.*/
                        /* The constructor makes p.lminfo = lminfo */
      -----> PerJavaUDRContext p = new PerJavaUDRContext(dbaName, modName,
                         dbaMonitor, lminfo);
                        dbaThread.setCurrentUDRContext(p);

                   if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",
                              informix.jvp.Traceable.TRACE_COARSE,
                              "after setCurrentUDRContext");

                   /* This makes lminfo->li_per_java_udr_context = p */
                   Object res =
                   UDRLMInfo.setUDRContext(lminfo, p);

                   if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",
                              informix.jvp.Traceable.TRACE_COARSE,
                              "after UDRLMInfo.setUDRContext : res = " + res);

                  if (res == null)
                      {
                      if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",

                              informix.jvp.Traceable.TRACE_COARSE,
                              "after setUDRContext : res == null");
                      //return JVP.FUNCFAIL;
                      return -199; <----------- RETURNING A UNIQUE VALUE
                      }

      At this point we think we are hitting a JVM bug. Let me explain
      further.

      At some point (as pointed to by the arrow above), a constructor is
      getting invoked, and causes the GC to be invoked. The GC completes but
      never returns control back to the 'java UDR thread' or causes the abrupt
      termination of this thread. The above code has been modified (for
      testing ) to return a unique value for SUCCESS (199) or FAILURE (-199).
      When the main application checks for the return value we find the value
      is 0!!. This is not possible since we are never returning a value of 0
      to the native application.

      So looks like when the GC finishes the control is not returned back to
      the proper Java thread.

      Another piece of information that may or may not be related to the problem
      is as follows:

      Right at the point where we are seeing this problem we are also
      observing a StackOverflowError being thrown. We have tried with stack
      sizes of 64k and 128k but the problem occurs at exactly the same point.
      This makes us wonder if this is in fact a StackOverflowError or a
      consequence of the GC error? (We are not sure).

      The exception trace we are oberving as a result of this is as shown
      below :

      java.lang.StackOverflowError
              at java.lang.String.<init>(String.java:246)
              at java.lang.Long.toString(Long.java:113)
              at java.lang.Long.toString(Long.java:226)
              at java.text.DigitList.set(DigitList.java:468)
              at java.text.DecimalFormat.format(DecimalFormat.java:409)
              at java.text.NumberFormat.format(NumberFormat.java:226)
              at
      java.text.SimpleDateFormat.zeroPaddingNumber(SimpleDateFormat.java:615)
              at
      java.text.SimpleDateFormat.subFormat(SimpleDateFormat.java:494)
              at java.text.SimpleDateFormat.format(SimpleDateFormat.java:415)
              at java.text.DateFormat.format(DateFormat.java:312)
              at informix.jvp.impl.LoggerImpl.log(LoggerImpl.java:121)
              at
      informix.jvp.impl.TraceStreamImpl.emit(TraceStreamImpl.java:280)
              at
      informix.jvp.impl.TraceStreamImpl.println(TraceStreamImpl.java:254)
              at
      informix.jvp.impl.TraceableImpl.tracePrint(TraceableImpl.java:280)
              at informix.jvp.impl.SemaphoreImpl.V(SemaphoreImpl.java:74)
              at
      informix.jvp.dbapplet.impl.DBAMonitorImpl.notifyDBATask(DBAMonitorImpl.java:90)

              at
      informix.jvp.dbapplet.impl.UDRDBAContextImpl.openContext(UDRDBAContextImpl.java:519)

      Attachment #1:

      stack_info.0211 - debugging info at the 8468th iteration (when things start
                        to fail. This is basically debugging info in the GC just
                        before the bug happens (or where the problem is suspected
                        to be happening).

      Customer completed another test run and put in stop points in 3 places:

          1. In our native application where the JNI call originates
          2. In the native portion of the code which is invoked by the Java method
      (which is invoked in step above)
          3. A stop point in the native application where the JNI call returns.

      Basically ---

      /*******************************************************************************************/

      /* TEMP Value to test if the value gets written */
      ] rc = 299;
      ] mt_logprintf("Before JNI call : Error code rc = %d", rc); <----
      STOP HERE (1)
      ] /* call upon Java to handle the rest */
      ] IFX_MT_ENTER_JNI_CALL;
      ] rc = (*JVP_env)->CallStaticIntMethod(JVP_env,
      ] jvpinfo->ji_UDRDBAContextImpl_class,
      ] jvpinfo->ji_UDRDBAContextImpl_openContext_mid, <----- java
      method (code snippet below)
      ] CPTR_TO_JHANDLE(lminfo),
      ] need_to_start_java_thread ? (jobject)NULL :
      ] (jobject)p->jt_thread);
      ] IFX_MT_LEAVE_JNI_CALL;
      ]
      ] mt_logprintf("After JNI call : Error code rc = %d", rc); <----- STOP HERE
      (3)
      /*******************************************************************************************/

      /*================= native code which gets invoked by the Java code
      ========================*/

      /* Class: informix_jvp_dbapplet_impl_UDRLMInfo
       * Method: setUDRContext
       * Signature: (JLinformix/jvp/dbapplet/impl/PerJavaUDRContext;)V
       */
      JNIEXPORT jobject JNICALL Java_informix_jvp_dbapplet_impl_UDRLMInfo_setUDRContext
        (JNIEnv *env, jclass cls, jlong lmhandle, jobject udrC)
      {
          lminfo_t *lminfo;
          jobject ref = NULL;

         mt_logprintf("coming into func setUDRContext: udrC = 0x%x\n", udrC);
      <----------STOP (2)
          lminfo = JHANDLE_TO_CPTR(lminfo_t *, lmhandle);
          /* make a JNI global ref since we are hiding this pointer from the VM */
          ref = (*env)->NewGlobalRef(env, udrC);
          if (!ref)
              {
              mt_logprintf("out of memory NewGlobalRef in setUDRContext:
      perJavaContext");
              return ref;
              }
          lminfo->li_per_java_udr_context = ref;
      mt_logprintf("coming out func setUDRContext: ref = 0x%x\n", ref);
      mt_logprintf("coming out func setUDRContext: lminfo->li_per_java_udr_context =
      0x%x\n", lminfo->li_per_java_udr_context);
          return ref;
      }
      /*================= native code which gets invoked by the Java code
      ========================*/


      /****** Java PORTION OF THE CODE --- FOR REFERENCE ------*/

                      }
                    else
                      {
                      /* We can re-use an existing dba java udr thread */
                      if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",
                              informix.jvp.Traceable.TRACE_COARSE,
                              "RE-USING dbaThread : "+ udrDbaThread);
                      if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",
                              informix.jvp.Traceable.TRACE_COARSE,
                              "RE-USING dbaMonitor : "+ udrDbaThread.dbaMonitor);
                      dbaThread = udrDbaThread;
                      dbaMonitor = (DBAMonitorImpl)udrDbaThread.dbaMonitor;
                      }

                   /* Create a new java UDR context.*/
                   /* The constructor makes p.lminfo = lminfo */
                   PerJavaUDRContext p = new PerJavaUDRContext(dbaName, modName, <---
      CONSTRUCTOR (We strongly suspect the GC gets invoked here. I will gather more
      conclusive evidence too. But the debug trace and logging info I have here
      ALSO PROVE THE POINT)

                      dbaMonitor, lminfo);

                   dbaThread.setCurrentUDRContext(p); <----------
      THIS CORRESPONDS TO THE NATIVE FUNCTION WITH STOP POINT (2)

                   if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",
                              informix.jvp.Traceable.TRACE_COARSE,
                              "after setCurrentUDRContext");

                   /* This makes lminfo->li_per_java_udr_context = p */
                   Object res =
                   UDRLMInfo.setUDRContext(lminfo, p);

                   if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",
                              informix.jvp.Traceable.TRACE_COARSE,
                              "after UDRLMInfo.setUDRContext : res = " + res);

                   if (res == null)
                      {
                      if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",
                              informix.jvp.Traceable.TRACE_COARSE,
                              "after setUDRContext : res == null");
                      //return JVP.FUNCFAIL;
                      return -199;
                      }
       if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",
                              informix.jvp.Traceable.TRACE_COARSE,
                              "NEW : notifying DBATask : INIT");

                   dbaMonitor.notifyDBATask(DBAMonitor.DBA_INIT);

                   if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Context",
                           Traceable.TRACE_SUPERFINE, "done monitor: " +
                              dbaMonitor.toString() + " " +
                              dbaMonitor.getName());

                   //return JVP.FUNCSUCC;
                   return 199;
                   }
                   catch (Exception exce)
                   {
                      if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",
                              informix.jvp.Traceable.TRACE_COARSE,
                              "UNEXPECTED EXCEPTION");
                      JVPError.unexpectedException(false, exce);
                      //return JVP.FUNCFAIL;
                      return -199;
                   }
                   finally {

                      if (Debug.ON) JVP.traceable.tracePrint("JVP.DBA.Thread",
                              informix.jvp.Traceable.TRACE_COARSE,
                              "IN FINAL BLOCK : UDRDBAContextImpl : openContext");
                   }

      /****** END : Java PORTION OF THE CODE --- FOR REFERENCE ------*/

      /********* TRACE CORRESPONDING TO JAVA UDR THREAD EXECUTION ************/
      2000/02/11 19:30:22.939 JVP#3: getUniqueThreadName : add modName
      2000/02/11 19:30:22.944 JVP#3: Trace: JVP.DBA.Thread: RE-USING dbaThread :
      Thread[test.mahmed.test1_jar:Test1#1,9,test.mahmed.test1_jar:Test1#1],state = IDLE
      T#4
      2000/02/11 19:30:22.948 JVP#3: Trace: JVP.DBA.Thread: RE-USING dbaMonitor :
      informix.jvp.dbapplet.impl.DBAMonitorImpl@619b8937
      2000/02/11 19:30:22.971 JVP#3: Trace: JVP.DBA.PerJavaUDRContext: CONSTRUCTOR :
      PerJavaUDRContext
      2000/02/11 19:30:23.010 JVP#3: Trace: JVP.DBA.Thread: IN FINAL BLOCK :
      UDRDBAContextImpl : openContext

      /***************************************************************/

      The results I see prove that the JVM does not execute the java code for the Java
      UDR thread in its entirety.

      The GC is (most probably) getting invoked at the point where the CONSTRUCTOR for
      PerJavaUDRContext is getting invoked. After this trace print, I only see a trace
      print coming out of the FINAL block. Also for all iterations my stop points are
      getting triggered in the proper order

      i.e. BEFORE JNI call ----

                  in the native functio setUDRContext()

            AFTER JNI calll



      But for the iteration where the problem is, the stop points get triggered like this



          BEFORE JNI


          AFTER JNI call


      (NOTE THAT THE STOP in the native function being invoked by the Java thread IS NOT TRIGGERED)


      Attachment #2:

      debugtrace - debugger output of all threads.






            duke J. Duke
            mchansunw Mei Chan (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:
              Resolved:
              Imported:
              Indexed: