Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8318109

Writing JFR records while a CHT has taken its lock asserts in rank checking

    XMLWordPrintable

Details

    • gc
    • b21
    • Verified

    Backports

      Description

        The following test failed in the JDK22 CI:

        applications/kitchensink/Kitchensink.java

        Here's a snippet from the log file:

        [2023-10-13T18:30:10.562876800Z] Gathering output for process 26612
        [2023-10-13T18:30:10.917412600Z] Waiting for completion for process 26612
        [2023-10-13T18:30:10.934455800Z] Waiting for completion finished for process 26612
        [stress.process.out] #
        [stress.process.out] # A fatal error has been detected by the Java Runtime Environment:
        [stress.process.out] #
        [stress.process.out] # Internal Error (c:\\sb\\prod\\1697208132\\workspace\\open\\src\\hotspot\\share\\runtime\\mutex.cpp:448), pid=39360, tid=41536
        [stress.process.out] # assert(false) failed: Attempting to acquire lock JfrStacktrace_lock/stackwatermark-1 out of order with lock ConcurrentHashTableResize_lock/service-1 -- possible deadlock
        [stress.process.out] #
        [stress.process.out] # JRE version: Java(TM) SE Runtime Environment (22.0+20) (fastdebug build 22-ea+20-1513)
        [stress.process.out] # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 22-ea+20-1513, mixed mode, sharing, tiered, compressed class ptrs, z gc, windows-amd64)
        [stress.process.out] # Core dump will be written. Default location: C:\\sb\\prod\\1697220906\\testoutput\\test-support\\jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink_java\\scratch\\0\\hs_err_pid39360.mdmp
        [stress.process.out] #
        [stress.process.out] # JFR recording file will be written. Location: C:\\sb\\prod\\1697220906\\testoutput\\test-support\\jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink_java\\scratch\\0\\hs_err_pid39360.jfr
        [stress.process.out] #
        [stress.process.out] Unsupported internal testing APIs have been used.
        [stress.process.out]
        [stress.process.out] # An error report file with more information is saved as:
        [stress.process.out] # C:\\sb\\prod\\1697220906\\testoutput\\test-support\\jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink_java\\scratch\\0\\hs_err_pid39360.log
        [stress.process.out] #
        [stress.process.out] # If you would like to submit a bug report, please visit:
        [stress.process.out] # https://bugreport.java.com/bugreport/crash.jsp
        [stress.process.out] #
        [2023-10-13T18:31:10.986027400Z] Gathering output for process 41560
        [2023-10-13T18:31:12.106450700Z] Waiting for completion for process 41560
        [2023-10-13T18:31:12.106450700Z] Waiting for completion finished for process 41560


        Here's the crashing thread's stack:

        --------------- T H R E A D ---------------

        Current thread (0x000001bd05db61a0): JavaThread "JFR Periodic Tasks" daemon [_thread_in_vm, id=41536, stack(0x0000008f77900000,0x0000008f77a00000) (1024K)]

        Stack: [0x0000008f77900000,0x0000008f77a00000]
        Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
        V [jvm.dll+0xc88c31] os::win32::platform_print_native_stack+0x101 (os_windows_x86.cpp:236)
        V [jvm.dll+0xf25a0f] VMError::report+0x147f (vmError.cpp:1000)
        V [jvm.dll+0xf27f05] VMError::report_and_die+0x645 (vmError.cpp:1819)
        V [jvm.dll+0xf28624] VMError::report_and_die+0x64 (vmError.cpp:1584)
        V [jvm.dll+0x559ecb] report_vm_error+0x5b (debug.cpp:191)
        V [jvm.dll+0xc225ed] Mutex::check_rank+0x22d (mutex.cpp:447)
        V [jvm.dll+0xc22c93] Mutex::lock_without_safepoint_check+0x43 (mutex.cpp:153)
        V [jvm.dll+0x858604] JfrStackTraceRepository::add_trace+0x24 (jfrStackTraceRepository.cpp:202)
        V [jvm.dll+0x858e15] JfrStackTraceRepository::record+0x155 (jfrStackTraceRepository.cpp:166)
        V [jvm.dll+0xfa7e09] JfrEvent<EventZPageAllocation>::write_event+0x119 (jfrEvent.hpp:209)
        V [jvm.dll+0x10070b3] ZPageAllocator::alloc_page+0x403 (zPageAllocator.cpp:744)
        V [jvm.dll+0xfe1ccf] ZHeap::alloc_page+0x3f (zHeap.cpp:227)
        V [jvm.dll+0x100467f] ZObjectAllocator::alloc_page+0x5f (zObjectAllocator.cpp:60)
        V [jvm.dll+0x10044cc] ZObjectAllocator::alloc_object_in_shared_page+0x6c (zObjectAllocator.cpp:94)
        V [jvm.dll+0x10041c6] ZObjectAllocator::alloc_object+0x66 (zObjectAllocator.cpp:162)
        V [jvm.dll+0x1017ece] ZRelocate::relocate_object+0xfe (zRelocate.cpp:379)
        V [jvm.dll+0x181dc] ZBarrier::load_barrier_on_oop_field_preloaded+0x3c (zBarrier.inline.hpp:479)
        V [jvm.dll+0x2d2931] AccessInternal::PostRuntimeDispatch<ZBarrierSet::AccessBarrier<593988,ZBarrierSet>,2,593988>::oop_access_barrier+0x31 (access.inline.hpp:86)
        V [jvm.dll+0x4abc3d] WeakHandle::peek+0x3d (weakHandle.inline.hpp:39)
        V [jvm.dll+0xda87b1] ConcurrentHashTable<StringTableConfig,11>::statistics_calculate<SizeFunc>+0xc1 (concurrentHashTable.inline.hpp:1239)
        V [jvm.dll+0xdaaaf3] StringTable::get_table_statistics+0x113 (stringTable.cpp:628)
        V [jvm.dll+0x8463e5] JfrPeriodicEventSet::requestStringTableStatistics+0x25 (jfrPeriodic.cpp:642)
        V [jvm.dll+0x8321a5] jfr_emit_event+0x85 (jfrJniMethod.cpp:231)
        C 0x000001bd58940ce1 (no source info available)

        The last pc belongs to native nmethod (printed below).
        Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
        J 12002 jdk.jfr.internal.JVM.emitEvent(JJJ)Z jdk.jfr@22-ea (0 bytes) @ 0x000001bd58940c46 [0x000001bd58940bc0+0x0000000000000086]
        J 36336 c1 jdk.jfr.internal.periodic.JVMEventTask.execute(JLjdk/jfr/internal/periodic/PeriodicType;)V jdk.jfr@22-ea (50 bytes) @ 0x000001bd505c5794 [0x000001bd505c53e0+0x00000000000003b4]
        J 36645 c2 jdk.jfr.internal.periodic.PeriodicTask.run(JLjdk/jfr/internal/periodic/PeriodicType;)V jdk.jfr@22-ea (55 bytes) @ 0x000001bd5821c8bc [0x000001bd5821c840+0x000000000000007c]
        J 36852 c1 jdk.jfr.internal.periodic.PeriodicEvents.doPeriodic()J jdk.jfr@22-ea (349 bytes) @ 0x000001bd51940284 [0x000001bd5193edc0+0x00000000000014c4]
        j jdk.jfr.internal.PlatformRecorder.periodicTask()V+47 jdk.jfr@22-ea
        j jdk.jfr.internal.PlatformRecorder.lambda$startDiskMonitor$1()V+1 jdk.jfr@22-ea
        j jdk.jfr.internal.PlatformRecorder$$Lambda+0x000001c1640e0668.run()V+4 jdk.jfr@22-ea
        j java.lang.Thread.runWith(Ljava/lang/Object;Ljava/lang/Runnable;)V+5 java.base@22-ea
        j java.lang.Thread.run()V+19 java.base@22-ea
        v ~StubRoutines::call_stub 0x000001bd578a106f
        Lock stack of current Java thread (top to bottom):


        [error occurred during error reporting (printing code blobs if possible), id 0xe0000000, Internal Error (c:\sb\prod\1697208132\workspace\open\src\hotspot\share\runtime\mutex.cpp:448)]


        This looks like the same failure mode that was addressed via:

        JDK-8317440 Lock rank checking fails when code root set is modified with the Servicelock held after JDK-8315503

        The fix for JDK-8317440 is integrated in jdk-22+19-1444 and
        this failure sighting occurred in jdk-22+20-1513-tier3.

        I'm starting this bug off in hotspot/gc and at P2 just like JDK-8317440.

        Attachments

          Issue Links

            Activity

              People

                tschatzl Thomas Schatzl
                dcubed Daniel Daugherty
                Votes:
                0 Vote for this issue
                Watchers:
                7 Start watching this issue

                Dates

                  Created:
                  Updated:
                  Resolved: