SIGSEGV in ZBarrierSet::AccessBarrier during virtual thread status query


    • x86_64
    • linux

      ADDITIONAL SYSTEM INFORMATION :
      System, OS, and Java runtime information
      Category Value
      OS: AlmaLinux release 9.3 (Shamrock Pampas Cat)
      OS Architecture: x86_64 (AMD EPYC 9654 96-Core Processor/ Intel(R) Xeon(R) Gold 5318Y 96 cores)
      CPU Cores: 16 logical processors in use inside the container
      Memory: 32 GB RAM
      JDK Version: OpenJDK 25.0.1+8 (Temurin-25.0.1+8, LTS)
      JVM: OpenJDK 64-Bit Server VM
      JVM Parameters: -Xmx26624m -Djava.util.concurrent.ForkJoinPool.common.parallelism=16 -XX:+UseZGC -XX:+ZGenerational
      GC Mode: ZGC (Generational)
      Crash Time: Wed Nov 12 15:47:31 2025 CST (Elapsed: 1d 13h 41m 36s for ZGC)

      A DESCRIPTION OF THE PROBLEM :
      The JVM crashes (SIGSEGV) when running in ZGC generational mode. The crash stack trace points to the ZGC read/write barrier (ZBarrierSet::AccessBarrier) while accessing virtual thread state (java_lang_Thread::get_thread_status).

      Appendix: Historical Context
      The crash, previously seen only on AMD EPYC 9654 with JDK 25.0.0, has now been reproduced on Intel CPUs using JDK 25.0.1. Crashes may be more frequent on AMD EPYC 9654 (Zen 4) systems than on Intel platforms, though further testing is needed to confirm this.
      Before the ZGC-related crashes reported above, we observed a similar JVM crash on JDK 25.0.0 when using G1GC (i.e., without -XX:+UseZGC) on AMD EPYC systems, specifically Alibaba Cloud custom servers (model: 9T24). This G1GC crash has not been reproduced in a 24-hour test under the same conditions with JDK 25.0.1, though further testing is needed to confirm this. Its stack trace is strikingly similar: the top native frame is a barrier dispatch reached while the thread processes a handshake operation inside Unsafe.unpark(), as shown below:
      Stack: [0x00007fac45d00000,0x00007fac45d40000], sp=0x00007fac45d3d028, free space=244k
      Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
      V [libjvm.so+0x4ac640] AccessInternal::PostRuntimeDispatch<G1BarrierSet::AccessBarrier<286822ul, G1BarrierSet>, (AccessInternal::BarrierType)3, 286822ul>::oop_access_barrier(oopDesc*, long)+0x0
      V [libjvm.so+0x11775a9] GetThreadSnapshotClosure::do_thread(Thread*)+0x6e9
      V [libjvm.so+0x9b5237] HandshakeOperation::do_handshake(JavaThread*)+0x47
      V [libjvm.so+0x9b5384] HandshakeState::process_by_self(bool, bool)+0xa4
      V [libjvm.so+0xf36995] SafepointMechanism::process(JavaThread*, bool, bool)+0x65
      V [libjvm.so+0x11a7e07] Unsafe_Unpark+0x147
      J 8523 jdk.internal.misc.Unsafe.unpark(Ljava/lang/Object;)V java.base@25
      J 50374 c2 java.lang.VirtualThread.runContinuation()V java.base@25
      J 47465 c2 java.util.concurrent.ForkJoinTask$InterruptibleTask.exec()Z java.base@25
      J 40336% c2 java.util.concurrent.ForkJoinPool.runWorker(Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V java.base@25
      j java.util.concurrent.ForkJoinWorkerThread.run()V+31 java.base@25


      REGRESSION : Last worked in version 25.0.1

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Environment:
      CPU: 16 logical processors (e.g., AMD EPYC 9654 or equivalent)
      Memory: 32 GB RAM
      OS: Linux (e.g., AlmaLinux 9.3)
      JDK: OpenJDK 25.0.1+8
      Start the JVM with the following essential and crash-relevant options (full command cleaned of duplicates):
      java \
        --add-opens=java.base/jdk.internal.loader=ALL-UNNAMED \
        --add-opens=java.management/sun.management=ALL-UNNAMED \
        --add-opens=java.management/java.lang.management=ALL-UNNAMED \
        --add-opens=java.base/java.lang=ALL-UNNAMED \
        --add-opens=java.base/java.lang.reflect=ALL-UNNAMED \
        --add-opens=java.base/sun.reflect.annotation=ALL-UNNAMED \
        --add-opens=java.base/java.math=ALL-UNNAMED \
        --add-opens=java.base/java.util=ALL-UNNAMED \
        --add-opens=java.base/sun.util.calendar=ALL-UNNAMED \
        --add-opens=java.base/java.io=ALL-UNNAMED \
        --add-opens=java.base/java.net=ALL-UNNAMED \
        --add-opens=java.base/java.util.concurrent=ALL-UNNAMED \
        --add-opens=java.xml/com.sun.org.apache.xerces.internal.jaxp.datatype=ALL-UNNAMED \
        --add-opens=jdk.internal.jvmstat/sun.jvmstat.monitor=ALL-UNNAMED \
        --add-opens=java.rmi/sun.rmi.transport=ALL-UNNAMED \
        \
        -Djava.util.logging.config.file=/opt/tomcat/conf/logging.properties \
        -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager \
        -javaagent:/opt/tomcat/rasp/basic.jar \
        \
        -Xms26624m \
        -Xmx26624m \
        -Xss256k \
        -XX:MetaspaceSize=128m \
        -XX:MaxMetaspaceSize=256m \
        \
        -XX:+UnlockExperimentalVMOptions \
        -XX:+UseZGC \
        -XX:+ZGenerational \
        \
        -Djava.util.concurrent.ForkJoinPool.common.parallelism=16 \
        -Djava.util.concurrent.ForkJoinPool.common.threadFactory=com.biz.forkjoinworkerthreadfactory.bizForkJoinWorkerThreadFactory \
        \
        -XX:+AlwaysPreTouch \
        -XX:ActiveProcessorCount=16 \
        \
        -XX:+HeapDumpOnOutOfMemoryError \
        -XX:HeapDumpPath=/opt/logs/service/java-rservice-12345-27wts.hprof \
        -XX:OnOutOfMemoryError=/opt/container/tools/heapdump_upload.sh \
        \
        -Xlog:gc*:file=/opt/logs/service/gc.log:tags,time,uptime,pid:filecount=5,filesize=32M \
        \
        -XX:StartFlightRecording=disk=true,maxsize=5000m,maxage=2d,settings=default.jfc \
        -XX:FlightRecorderOptions=maxchunksize=64m,repository=/opt/logs/service,stackdepth=64 \
        \
        -Djdk.attach.allowAttachSelf=true \
        -XX:+EnableDynamicAgentLoading \
        -XX:-OmitStackTraceInFastThrow \
        \
        -Djava.security.egd=file:/dev/./urandom \
        -Dcatalina.base=/opt/tomcat \
        -Dcatalina.home=/opt/tomcat \
        -Djava.io.tmpdir=/opt/tomcat/temp


      Run a sustained medium-intensity workload that maintains average CPU utilization between 30% and 50% (measured across all 16 logical processors), achieved by a mix of:
      Virtual-thread-based I/O-bound requests (e.g., HTTP API calls via Tomcat), and
      Periodic CPU-bound tasks submitted to the common ForkJoinPool (parallelism=16).
      Let the application run for 10 minutes to 48 hours under sustained load.

      Observe:

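      An intermittent JVM crash (SIGSEGV) may occur in ZBarrierSet::AccessBarrier during thread state access, anywhere from 10 minutes to 2 days into the run. The crash produces an hs_err_*.log file (in the JVM's working directory unless -XX:ErrorFile is set), and JFR data is kept in the configured repository. The heap-dump options above apply only to OutOfMemoryError, so the SIGSEGV itself does not produce a heap dump.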
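
The workload described in the steps above can be approximated in a standalone program. The following is a minimal sketch, not the reporter's application: the class name MixedLoadSketch, the 5 ms sleep standing in for an HTTP call, and the loop sizes are all assumptions chosen for illustration, and running it has not been shown to reproduce the crash.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the production workload (all names are illustrative).
public class MixedLoadSketch {

    // Simulated I/O-bound request: the virtual thread parks while it sleeps.
    static void fakeRequest(AtomicLong completed) {
        try {
            Thread.sleep(Duration.ofMillis(5));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        completed.incrementAndGet();
    }

    // Simulated CPU-bound task for the common ForkJoinPool.
    static long burnCpu(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += ((long) i * i) % 7;
        }
        return acc;
    }

    // Runs one round of mixed load and returns the number of completed requests.
    public static long runRound(int requests, int cpuTasks) throws Exception {
        AtomicLong completed = new AtomicLong();
        List<Future<Long>> cpuResults = new ArrayList<>();
        try (ExecutorService perRequest = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                perRequest.submit(() -> fakeRequest(completed));
            }
            Callable<Long> cpuTask = () -> burnCpu(1_000_000);
            for (int i = 0; i < cpuTasks; i++) {
                cpuResults.add(ForkJoinPool.commonPool().submit(cpuTask));
            }
        } // close() waits for every virtual-thread task to finish
        for (Future<Long> f : cpuResults) {
            f.get();
        }
        return completed.get();
    }

    public static void main(String[] args) throws Exception {
        // Assumed default: run for one minute; pass a larger value for soak tests.
        long minutes = args.length > 0 ? Long.parseLong(args[0]) : 1;
        long deadline = System.nanoTime() + Duration.ofMinutes(minutes).toNanos();
        long total = 0;
        while (System.nanoTime() < deadline) {
            total += runRound(2_000, 16);
        }
        System.out.println("completed requests: " + total);
    }
}
```

To approximate the reported setup, run it with the GC and ForkJoinPool flags listed in the command above and a duration argument in minutes, for example 2880 for a two-day soak.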


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      The JVM should not crash when querying thread status under generational ZGC, even during high-concurrency ForkJoinPool workloads and regardless of CPU architecture. Thread state access must be safe and null-checked.
      ACTUAL -
      The JVM crashed with the following hs_err output:
      --------------- T H R E A D ---------------

      Current thread (0x00007fbf84003620): JavaThread "ForkJoinPool-1-worker-3" daemon [_thread_in_vm, id=583, stack(0x00007fc4b1bad000,0x00007fc4b1bed000) (256K)]

      Stack: [0x00007fc4b1bad000,0x00007fc4b1bed000], sp=0x00007fc4b1bea050, free space=244k
      Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
      V [libjvm.so+0x4ad3e6] AccessInternal::PostRuntimeDispatch<ZBarrierSet::AccessBarrier<286790ul, ZBarrierSet>, (AccessInternal::BarrierType)3, 286790ul>::oop_access_barrier(oopDesc*, long)+0x16
      V [libjvm.so+0xa27a51] java_lang_Thread::get_thread_status(oopDesc*)+0x11
      V [libjvm.so+0x1177929] GetThreadSnapshotClosure::do_thread(Thread*)+0x6e9
      V [libjvm.so+0x9b5477] HandshakeOperation::do_handshake(JavaThread*)+0x47
      V [libjvm.so+0x9b55c4] HandshakeState::process_by_self(bool, bool)+0xa4
      V [libjvm.so+0xf36c55] SafepointMechanism::process(JavaThread*, bool, bool)+0x65
      V [libjvm.so+0xa1462a] InterpreterRuntime::at_safepoint(JavaThread*)+0x16a
      j java.util.concurrent.ForkJoinPool.runWorker(Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V+388 java.base@25.0.1
      j java.util.concurrent.ForkJoinWorkerThread.run()V+31 java.base@25.0.1
      v ~StubRoutines::call_stub 0x00007fc4f7946f9f
      V [libjvm.so+0xa21640] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x2b0
      V [libjvm.so+0xa22f7f] JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, JavaThread*)+0x1df
      V [libjvm.so+0xb137cc] thread_entry(JavaThread*, JavaThread*)+0x8c
      V [libjvm.so+0xa3bd1e] JavaThread::thread_main_inner()+0x1de
      V [libjvm.so+0x116ab9f] Thread::call_run()+0x9f
      V [libjvm.so+0xe71cc6] thread_native_entry(Thread*)+0xd6
      Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
      j java.util.concurrent.ForkJoinPool.runWorker(Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V+388 java.base@25.0.1
      j java.util.concurrent.ForkJoinWorkerThread.run()V+31 java.base@25.0.1
      v ~StubRoutines::call_stub 0x00007fc4f7946f9f

      siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000050

      Registers:
      RAX=0x0000000000000000, RBX=0x0000000000000001, RCX=0x000000000000000d, RDX=0x00007fc50e9ddbe0
      RSP=0x00007fc4b1bea050, RBP=0x00007fc4b1bea070, RSI=0x0000000000000050, RDI=0x0000000000000000
      R8 =0x00000400a9610248, R9 =0x0000000000000000, R10=0x0000000000000006, R11=0x0000000000000246
      R12=0x0000000000000050, R13=0x00007fc50edc4aa0, R14=0x00007fc4b1beb690, R15=0x00007fbfde9bde80
      RIP=0x00007fc50daf13e6, EFLAGS=0x0000000000010247, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
        TRAPNO=0x000000000000000e

      XMM[0]=0x0000000000000000 0x0000000000000000
      XMM[1]=0x0000000000000000 0x0000040574b64bb8
      XMM[2]=0x0c98f5818685ba9f 0x4a319941fc018edf
      XMM[3]=0xee830681e1ddc99f 0x59db6a41e191dddf
      XMM[4]=0x000e178101b4d89f 0x34e63b4167e12cdf
      XMM[5]=0x54a5cbdf48d86714 0x797abd745399f298
      XMM[6]=0xacc31adf05b72b7d 0x6e57dba94ca9d4f3
      XMM[7]=0xd3bdaae9fe5e555a 0x81232a683b68cd3d
      XMM[8]=0x058f45e5c5f2280c 0x0b3c73a14245fb4e
      XMM[9]=0xacadcb1dacadcb1d 0xacadcb1d283abe1f
      XMM[10]=0x0000000000000000 0x000000007b8cf302
      XMM[11]=0x19328acd19328acd 0x19328acd416d48ec
      XMM[12]=0x0000000000000000 0x00000000283abe1f
      XMM[13]=0x0000000000000000 0x3ff0000000000000
      XMM[14]=0x0000000000000000 0x0000000000000001
      XMM[15]=0x0000000000000000 0x0000000000000009
        MXCSR=0x00001fa6


      Register to memory mapping:

      RAX=0x0 is null
      RBX=0x0000000000000001 is an unknown value
      RCX=0x000000000000000d is an unknown value
      RDX=0x00007fc50e9ddbe0: <offset 0x0000000001399be0> in /usr/java/jdk25/lib/server/libjvm.so at 0x00007fc50d644000
      RSP=0x00007fc4b1bea050 is pointing into the stack for thread: 0x00007fbf84003620
      RBP=0x00007fc4b1bea070 is pointing into the stack for thread: 0x00007fbf84003620
      RSI=0x0000000000000050 is an unknown value
      RDI=0x0 is null
      R8 =0x00000400a9610248 is a zaddress: java.util.concurrent.ForkJoinPool$WorkQueue
      {0x00000400a9610248} - klass: 'java/util/concurrent/ForkJoinPool$WorkQueue' - flags:

       - ---- fields (total size 40 words):
       - 'base' 'I' @12 3513851 (0x00359dfb)
       - final 'config' 'I' @16 1 (0x00000001)
       - final 'owner' 'Ljava/util/concurrent/ForkJoinWorkerThread;' @24 a 'jdk/internal/misc/CarrierThread'{0x00000400a8401080} (0x0080150802101630)
       - 'array' '[Ljava/util/concurrent/ForkJoinTask;' @32 a 'java/util/concurrent/ForkJoinTask'[64] {0x000004000066c7a0} (0x0080000cd8f41630)
       - 'top' 'I' @168 3513851 (0x00359dfb)
       - volatile 'phase' 'I' @172 -1508245489 (0xa61a000f)
       - 'stackPred' 'I' @176 -294387695 (0xee740011)
       - volatile 'source' 'I' @180 24 (0x00000018)
       - 'nsteals' 'I' @184 63888359 (0x03cedbe7)
       - volatile 'parking' 'I' @188 0 (0x00000000)
      R9 =0x0 is null
      R10=0x0000000000000006 is an unknown value
      R11=0x0000000000000246 is an unknown value
      R12=0x0000000000000050 is an unknown value
      R13=0x00007fc50edc4aa0: <offset 0x0000000001780aa0> in /usr/java/jdk25/lib/server/libjvm.so at 0x00007fc50d644000
      R14=0x00007fc4b1beb690 is pointing into the stack for thread: 0x00007fbf84003620
      R15=0x00007fbfde9bde80 is pointing into the stack for thread: 0x00007fc3d8015b80
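
The register dump above can be read as a null base plus a small field offset. The faulting frame, oop_access_barrier(oopDesc*, long), takes a base object and an offset. Under the x86-64 System V calling convention those arrive in RDI and RSI, which hold 0x0 and 0x50 here, and their sum equals si_addr. ERR=0x4 with TRAPNO=0xe is a page fault whose error code decodes as a user-mode read of a non-present page. The sketch below only redoes that arithmetic. The class and constant names are illustrative, and reading the registers as (base, offset) is an inference from the frame signature, not something the crash log states.

```java
// Illustrative decoder for the siginfo/register values quoted in this report.
public class FaultDecodeSketch {

    // Bits of the x86 page-fault error code (the ERR value in the register dump).
    static final long PF_PRESENT = 1L << 0; // 0 = page not present, 1 = protection violation
    static final long PF_WRITE   = 1L << 1; // 0 = read access, 1 = write access
    static final long PF_USER    = 1L << 2; // 0 = kernel mode, 1 = user mode

    static String describe(long err) {
        return ((err & PF_USER) != 0 ? "user" : "kernel") + "-mode "
             + ((err & PF_WRITE) != 0 ? "write" : "read") + " of a "
             + ((err & PF_PRESENT) != 0 ? "protected" : "non-present") + " page";
    }

    // Address touched by a field load at base + offset.
    static long faultAddress(long base, long offset) {
        return base + offset;
    }

    public static void main(String[] args) {
        long rdi = 0x0L;   // first argument register: assumed to be the oop base
        long rsi = 0x50L;  // second argument register: assumed to be the field offset
        System.out.println(describe(0x4L));                                 // user-mode read of a non-present page
        System.out.println("0x" + Long.toHexString(faultAddress(rdi, rsi))); // 0x50
    }
}
```

If that reading is right, the thread object handed to java_lang_Thread::get_thread_status was null when the handshake closure ran, which matches the null check requested under EXPECTED.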

      ---------- BEGIN SOURCE ----------
      No minimal standalone reproducer is available. The crash was observed in a production Tomcat environment under a high-concurrency ForkJoinPool workload with -XX:+ZGenerational.

      However, the crash recurs after 10 minutes to 2 days of runtime with the following JVM flags:
        -XX:+UseZGC -XX:+ZGenerational -Xmx26g -Djava.util.concurrent.ForkJoinPool.common.parallelism=16
      ---------- END SOURCE ----------
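
Both native stacks show GetThreadSnapshotClosure running inside a handshake, which suggests that another thread was requesting a snapshot of the worker at the time of the crash. The report does not say what issued that request. The sketch below assumes, purely as a hypothesis, that a periodic thread dump (for example from a monitoring agent) is the trigger, and issues such dumps through the standard HotSpotDiagnosticMXBean.dumpThreads API. The class name, file names, and one-second interval are illustrative, and whether dumpThreads reaches the same VM code path in JDK 25.0.1 has not been verified here.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.HotSpotDiagnosticMXBean.ThreadDumpFormat;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical poller that requests thread dumps while a workload is running.
public class ThreadDumpPollerSketch {

    // Writes one JSON thread dump into dir and returns the path of the new file.
    static Path dumpOnce(Path dir, int seq) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpThreads requires an absolute path to a file that does not exist yet.
        Path out = dir.toAbsolutePath().resolve("threads-" + seq + ".json");
        bean.dumpThreads(out.toString(), ThreadDumpFormat.JSON);
        return out;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("vt-dumps");
        int dumps = args.length > 0 ? Integer.parseInt(args[0]) : 10;
        for (int i = 0; i < dumps; i++) {
            Path p = dumpOnce(dir, i);
            System.out.println("wrote " + p + " (" + Files.size(p) + " bytes)");
            Files.delete(p);      // keep disk usage flat during long runs
            Thread.sleep(1_000);  // assumed polling interval
        }
    }
}
```

In a reproduction attempt, dumpOnce would be called from a background thread in the same JVM as the virtual-thread workload, so that snapshots are taken while carrier threads are mounting and unmounting virtual threads.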

            Assignee: Patricio Chilano Mateo
            Reporter: Webbug Group
            Votes: 0
            Watchers: 5
