JDK-6480378

Backport 5065001, 6259348 and others to 5.0 update release


    • b01
    • x86
    • windows_2000, windows_2003, windows_xp

        Customer reports they are running into bug 6429965. They submitted 6 HotSpot error files and a testcase. The customer's data and testcase are at:
        /net/cores.central/cores/65182343.

        My analysis shows traces like the one in the bug report. The active thread is

         =>0x00a41e48 JavaThread "Finalizer" daemon [_thread_in_native, id=740]

        with a stack

        Stack: [0x061f0000,0x062f0000), sp=0x062ef844, free space=1022k
        Native frames: (J=compiled Java code, j=interpreted, Vv=VM code,
        C=native code)
        C [ntdll.dll+0x10f3]
        j java.awt.Font.pDispose()V+0
        J java.awt.Font.finalize()V
        v ~RuntimeStub::alignment_frame_return Runtime1 stub
        v ~StubRoutines::call_stub
        V [jvm.dll+0x86401]
        V [jvm.dll+0xdb172]
        V [jvm.dll+0x862d2]
        V [jvm.dll+0x8b623]
        C [java.dll+0x2006]
        J java.lang.ref.Finalizer.runFinalizer()V
        J java.lang.ref.Finalizer.access$100(Ljava/lang/ref/Finalizer;)V
        v ~RuntimeStub::alignment_frame_return Runtime1 stub
        j java.lang.ref.Finalizer$FinalizerThread.run()V+11

        hs_err_pid360.log differs slightly: the crash occurred at a different
        location in ntdll.dll.

        Here is more information from the customer.

        Background:
        Latest release of started deployment in July, with subsequent deployments since then. This release includes an upgrade from Java 1.4.2 to Java 1.5.0. A high profile customer was upgraded to this new release the weekend of 30-Sep. This customer experienced application crashes three times during the week following the upgrade. This required us to undertake the undesirable action of rolling back the customer to the last known good release.

        Problem:
        HotSpot error files obtained from the customer machines indicated an Access Violation was occurring in the JVM, 1.5.0_07-b03. Specifically, the problem occurred in the Garbage Collector's Finalizer thread in the Windows DLL ntdll.dll, at ntdll.dll + 0x10f3.

        Analysis:
        The team analyzed the stack trace from the HotSpot error file, identified the exported function call at ntdll.dll offset 0x10f3, and analyzed the JVM source code.

        Defect:
        There is a race condition in the collaboration between the scheduleDelete() method of Awt_Object.cpp and the WM_AWT_DISPOSE window message processing of Awt_Toolkit.cpp:

        If the Finalizer thread is blocked in AwtObject::scheduleDelete() immediately after posting WM_AWT_DISPOSE, then Awt_Toolkit processing of WM_AWT_DISPOSE deletes the AwtObject instance before AwtObject has completed execution of its scheduleDelete() method.

        When the Finalizer thread resumes execution within AwtObject::scheduleDelete(), the destructor for CriticalSection::Lock is automatically invoked as the method returns. Because the AwtObject instance has already been deleted by then, this invocation is invalid and results in the Access Violation Exception within the JVM. See the enclosed HotSpot error file for details:

        <<hs_err_pid436.log>>

        Reproducing the JVM Defect:
        Based on the code analysis above, the team has developed a simple test program. The test program shows that the OS can schedule another thread while a thread is inside a critical section; that is, executing a critical section is not an atomic operation. The awt_Font call to scheduleDelete(), which posts the message that destroys the object itself, is at times suspended by the OS just after it has sent the message but before CriticalSection::~Lock() has been called. The awt_Toolkit thread then processes the message and deletes the object. At this point, CriticalSection::~Lock() executes against the deleted object, and ::LeaveCriticalSection() crashes with a general protection fault (GPF), as indicated in the HotSpot error log. See the enclosed zip file containing the sources, executable, and makefile for this test program.
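
The ordering problem described above can be illustrated with a small, deterministic model. This is only a sketch and is not the customer's test program or the AWT source: ModelObject, ModelLock, toolkitHandlesDispose and simulateScheduleDelete are hypothetical names, the "deletion" is modeled with a flag so that no memory is actually freed, and the preemption is simulated by calling the toolkit step inline rather than by using real threads.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an AwtObject. `alive` models whether the
// toolkit thread has already deleted it; nothing is really freed here.
struct ModelObject {
    bool alive = true;
};

// Hypothetical stand-in for CriticalSection::Lock: a scoped guard whose
// destructor runs when the enclosing scope in scheduleDelete() ends.
class ModelLock {
public:
    ModelLock(const ModelObject& owner, std::vector<std::string>& log)
        : owner_(owner), log_(log) {
        log_.push_back("enter critical section");
    }
    ~ModelLock() {
        // In the real code this is where ::LeaveCriticalSection() would run.
        log_.push_back(owner_.alive
                           ? "leave critical section: ok"
                           : "leave critical section: owner already deleted");
    }

private:
    const ModelObject& owner_;
    std::vector<std::string>& log_;
};

// Models the toolkit thread handling WM_AWT_DISPOSE.
void toolkitHandlesDispose(ModelObject& obj, std::vector<std::string>& log) {
    obj.alive = false;
    log.push_back("toolkit deletes object");
}

// Models scheduleDelete(). When `preempted` is true, the toolkit step is
// assumed to run between posting the message and leaving the guarded scope.
std::vector<std::string> simulateScheduleDelete(bool preempted) {
    std::vector<std::string> log;
    ModelObject obj;
    {
        ModelLock lock(obj, log);
        log.push_back("post WM_AWT_DISPOSE");
        if (preempted) {
            toolkitHandlesDispose(obj, log);
        }
    }  // ModelLock destructor runs here
    if (!preempted) {
        toolkitHandlesDispose(obj, log);
    }
    return log;
}
```

In this model, simulateScheduleDelete(false) logs the release of the critical section before the deletion, while simulateScheduleDelete(true) logs the deletion first, so the guard's destructor runs against an object that is already gone. That second ordering corresponds to the crash in ::LeaveCriticalSection() reported above.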

        The files are at /net/cores.central/cores/65182343

        After some discussion, we have decided to fully backport some of the fixes (5065001, 6259348 and others) to a 5.0 update release. I am therefore changing this CR synopsis to match.

              art Artem Ananiev (Inactive)
              tstatt Terry Statt (Inactive)