JVM crashes with SIGSEGV in GC barrier (G1/ZGC) during handshake

    • x86_64
    • linux

      ADDITIONAL SYSTEM INFORMATION :
      Property settings:
          file.encoding = UTF-8
          file.separator = /
          java.class.path =
          java.class.version = 69.0
          java.home = /usr/java/jdk25
          java.io.tmpdir = /tmp
          java.library.path = /usr/java/packages/lib
              /usr/lib64
              /lib64
              /lib
              /usr/lib
          java.runtime.name = OpenJDK Runtime Environment
          java.runtime.version = 25.0.1+8-LTS
          java.specification.name = Java Platform API Specification
          java.specification.vendor = Oracle Corporation
          java.specification.version = 25
          java.vendor = Eclipse Adoptium
          java.vendor.url = https://adoptium.net/
          java.vendor.url.bug = https://github.com/adoptium/adoptium-support/issues
          java.vendor.version = Temurin-25.0.1+8
          java.version = 25.0.1
          java.version.date = 2025-10-21
          java.vm.compressedOopsMode = Zero based
          java.vm.info = mixed mode, sharing
          java.vm.name = OpenJDK 64-Bit Server VM
          java.vm.specification.name = Java Virtual Machine Specification
          java.vm.specification.vendor = Oracle Corporation
          java.vm.specification.version = 25
          java.vm.vendor = Eclipse Adoptium
          java.vm.version = 25.0.1+8-LTS
          jdk.debug = release
          line.separator = \n
          native.encoding = ANSI_X3.4-1968
          os.arch = amd64
          os.name = Linux
          os.version = 5.15.80-trip20230703.el9.x86_64
          path.separator = :
          stderr.encoding = ANSI_X3.4-1968
          stdin.encoding = ANSI_X3.4-1968
          stdout.encoding = ANSI_X3.4-1968
          sun.arch.data.model = 64
          sun.boot.library.path = /usr/java/jdk25/lib
          sun.cpu.endian = little
          sun.io.unicode.encoding = UnicodeLittle
          sun.java.launcher = SUN_STANDARD
          sun.jnu.encoding = ANSI_X3.4-1968
          sun.management.compiler = HotSpot 64-Bit Tiered Compilers
          user.country = US
          user.dir = /root
          user.home = /root
          user.language = en
          user.name = root

      openjdk version "25.0.1" 2025-10-21 LTS
      OpenJDK Runtime Environment Temurin-25.0.1+8 (build 25.0.1+8-LTS)
      OpenJDK 64-Bit Server VM Temurin-25.0.1+8 (build 25.0.1+8-LTS, mixed mode, sharing)

      A DESCRIPTION OF THE PROBLEM :
      JVM crashes with SIGSEGV in GC barrier (G1/ZGC) during handshake when GetThreadSnapshotClosure accesses java.lang.Thread fields of virtual threads (vthreads) running on ForkJoinWorkerThreads.

      On JDK 25.0.1, both G1GC and ZGC crash with SIGSEGV in their respective AccessBarrier::oop_access_barrier implementations (si_addr ≈ 0x44 or 0x50), indicating a null/dangling oop dereference.
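
      The reasoning behind reading a small si_addr as a null-based access can be sketched as a triage helper: a fault address below the first page is consistent with a field offset applied to a null base pointer. This is only an illustrative sketch. The class name, the 4 KiB page-size threshold, and the exact `si_addr: 0x...` text shape are assumptions, not details taken from the crash logs in this report.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical triage helper; not part of the JDK or of this report's tooling.
public class SiAddrTriage {
    // Assumption: 4 KiB pages, so any fault address below 4096 lies in the unmapped null page.
    public static final long NULL_PAGE_LIMIT = 4096L;

    // Assumption: the fault address appears in the log as "si_addr: 0x<hex>".
    private static final Pattern SI_ADDR = Pattern.compile("si_addr:\\s*0x([0-9a-fA-F]+)");

    /** Extracts the fault address from a log line, or returns empty if none is found. */
    public static OptionalLong parseSiAddr(String line) {
        Matcher m = SI_ADDR.matcher(line);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseUnsignedLong(m.group(1), 16));
    }

    /** True when the address is small enough to look like "null base + field offset". */
    public static boolean looksLikeNullBaseAccess(long siAddr) {
        // Unsigned comparison so that very high addresses are not mistaken for small ones.
        return Long.compareUnsigned(siAddr, NULL_PAGE_LIMIT) < 0;
    }

    public static void main(String[] args) {
        // 0x44 is one of the two fault addresses quoted in this report.
        long addr = parseSiAddr("si_addr: 0x0000000000000044").getAsLong();
        System.out.println("si_addr=0x" + Long.toHexString(addr)
                + " nearNull=" + looksLikeNullBaseAccess(addr));
    }
}
```

      Both addresses quoted above (0x44 and 0x50) fall under the threshold, which fits a null base pointer but does not by itself rule out a dangling oop.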

      The crash occurs in the context of:
      - ForkJoinWorkerThread (the default carrier for virtual threads)
      - Handshake triggered by GetThreadSnapshotClosure::do_thread()
      - Which reads oop fields from java.lang.Thread (e.g., via java_lang_Thread::get_thread_status)

      Virtual threads are scheduled on ForkJoinPool workers, and their java.lang.Thread instances are short-lived and only weakly referenced, so this crash most likely manifests under high virtual-thread concurrency (e.g., web servers using Structured Concurrency or other Project Loom APIs).

      The issue appears to be that GetThreadSnapshotClosure accesses oop-valued fields of Thread objects without ensuring the referenced objects are still valid during concurrent GC phases. This leads to GC barriers being invoked on invalid oops, causing native crashes.

      We previously reported a crash occurring with ZGC on JDK 25.0.1 involving virtual threads and GetThreadSnapshotClosure. We have now reproduced a nearly identical crash with G1GC under similar conditions, indicating that the underlying issue is not GC-specific but likely rooted in shared runtime or handshake logic.

      REGRESSION : Last worked in version 24

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      There are no deterministic steps. The crash (SIGSEGV) is intermittent and occurs during thread state access, typically after several minutes of runtime. When it occurs, crash logs (hs_err_*.log), JFR recordings, and heap dumps are written to the configured paths.
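
      As a starting point for a standalone stress test, the sketch below runs many virtual threads while repeatedly requesting thread dumps through HotSpotDiagnosticMXBean.dumpThreads. It is not a confirmed reproducer. It assumes that thread dumps are one way to drive the thread snapshot code named in this report, which the report itself does not state. The class name, thread counts, and temp-file layout are hypothetical, and it requires JDK 21 or later.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.HotSpotDiagnosticMXBean.ThreadDumpFormat;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical stress harness; an unverified sketch, not the reporter's workload.
public class VThreadDumpStress {

    /** Runs the given number of busy virtual threads and takes `dumps` thread dumps. */
    public static int run(int virtualThreads, int dumps) throws Exception {
        HotSpotDiagnosticMXBean diag =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        Path dir = Files.createTempDirectory("vthread-dump");
        AtomicBoolean stop = new AtomicBoolean(false);
        int completed = 0;

        // close() at the end of this block waits for all submitted tasks to finish.
        try (ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < virtualThreads; i++) {
                pool.submit(() -> {
                    while (!stop.get()) {
                        byte[] garbage = new byte[1024];   // allocation to keep the GC active
                        LockSupport.parkNanos(1_000_000L); // park so the thread unmounts and remounts
                    }
                });
            }
            try {
                for (int i = 0; i < dumps; i++) {
                    // dumpThreads needs an absolute path to a file that does not exist yet.
                    Path out = dir.resolve("dump-" + i + ".json").toAbsolutePath();
                    diag.dumpThreads(out.toString(), ThreadDumpFormat.JSON);
                    Files.delete(out);
                    completed++;
                }
            } finally {
                stop.set(true); // let the workers exit before the executor is closed
            }
        }
        Files.delete(dir);
        return completed;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("dumps completed: " + run(1_000, 100));
    }
}
```

      Running it with the same GC flags as the production deployment, and for much longer than the defaults in main, would be needed before drawing any conclusion from a clean run.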

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      The JVM should not crash when querying thread status under G1 or generational ZGC, even during high-concurrency ForkJoinPool workloads and across different CPU architectures. Thread state access must be safe and null-checked.
      ACTUAL -
      The JVM crashes with SIGSEGV after between 5 minutes and 2 days of runtime, with a fairly high probability (20%-50%) on AMD EPYC 9654.

      ---------- BEGIN SOURCE ----------
      No minimal standalone reproducer is available. The crash was observed in a production Tomcat environment under a high-concurrency ForkJoinPool workload, with either -XX:+ZGenerational or G1GC.
      ---------- END SOURCE ----------

            Assignee:
            Patricio Chilano Mateo
            Reporter:
            Webbug Group