JDK-8330470

TLAB initialization may cause div by zero

    • Type: Bug
    • Resolution: Unresolved
    • Priority: P3
    • None
    • 17.0.10
    • hotspot
    • x86_64
    • linux_redhat_8.0

      This seems related to, but slightly different from, the case fixed in JDK-8308766.

      We see a crash with SIGFPE at ThreadLocalAllocBuffer::initial_desired_size()

      It is reproducible only on a specific set of machines and does not occur anywhere else with the same application. I am not sure what is needed to reproduce it elsewhere.


      One possible candidate for a SIGFPE in the code is

       init_sz = (Universe::heap()->tlab_capacity(thread()) / HeapWordSize) /
                            (nof_threads * target_refills());

      at https://github.com/openjdk/jdk17u/blob/jdk-17.0.10-ga/src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp#L280

      HeapWordSize seems to be a constant, but maybe either nof_threads or target_refills() can be zero in some cases?
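To make the suspicion concrete, here is a minimal standalone sketch of the same arithmetic with a guard in front of the division. It is not HotSpot code: the function names, the parameters, and the fallback to a minimum size are hypothetical, and it assumes both factors of the divisor are 32-bit unsigned integers. Under that assumption the divisor is zero when either factor is zero, and also when the product wraps around to zero.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper, not part of HotSpot: reports whether the divisor of the
// quoted expression would be zero. The product is computed in unsigned int, so
// it is zero if either factor is zero or if the multiplication wraps to zero.
bool divisor_is_zero(unsigned int nof_threads, unsigned int target_refills) {
  return nof_threads * target_refills == 0u;
}

// Hypothetical guarded variant of the quoted sizing expression.
// Assumption for illustration: fall back to min_words instead of dividing
// when the divisor (or the word size) is zero, then clamp to [min_words, max_words].
std::size_t guarded_initial_size_words(std::size_t tlab_capacity_bytes,
                                       std::size_t heap_word_size,
                                       unsigned int nof_threads,
                                       unsigned int target_refills,
                                       std::size_t min_words,
                                       std::size_t max_words) {
  std::size_t init_sz = min_words;
  if (heap_word_size != 0 && !divisor_is_zero(nof_threads, target_refills)) {
    // Same shape as the expression in the report: capacity in words divided
    // by (threads * refills).
    init_sz = (tlab_capacity_bytes / heap_word_size) /
              (nof_threads * target_refills);
  }
  return std::min(std::max(init_sz, min_words), max_words);
}
```

Without such a guard, an integer division by zero on x86_64 raises a hardware exception that Linux delivers as SIGFPE, which would match the signal in the crash report. Whether either factor can actually be zero at this point in the JVM is not confirmed here.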

      Excerpts from the hs_err_pid file:

      #
      # A fatal error has been detected by the Java Runtime Environment:
      #
      # SIGFPE (0x8) at pc=0x00007efeed9b5b9c, pid=3048299, tid=3050463
      #
      # JRE version: (17.0.10+7) (build )
      # Java VM: OpenJDK 64-Bit Server VM (17.0.10+7, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
      # Problematic frame:
      # V [libjvm.so+0xe68b9c] ThreadLocalAllocBuffer::initial_desired_size()+0x10c


      Stack: [0x00007efd9c21e000,0x00007efd9ca1e000], sp=0x00007efd9ca1cd20, free space=8187k
      Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
      V [libjvm.so+0xe68b9c] ThreadLocalAllocBuffer::initial_desired_size()+0x10c
      V [libjvm.so+0xe68be4] ThreadLocalAllocBuffer::initialize()+0x24
      V [libjvm.so+0x8bfec4] attach_current_thread.part.0+0x94
      V [libjvm.so+0x8c023d] jni_AttachCurrentThread+0x6d
      C 0x00007efd9cb5b701
      C 0x00007efd9cb5ba4e


      Potential workarounds:
      * Disable TLAB with -XX:-UseTLAB; this may have a large performance impact.
      * Configure an initial TLAB size via the JVM parameter -XX:TLABSize=... to try to avoid the code branch that crashes (https://github.com/openjdk/jdk17u/blob/jdk-17.0.10-ga/src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp#L273), e.g. -XX:TLABSize=2k (must be between 1k and 512k). It seems the JDK uses this only as the initial size and resizes properly afterwards, see https://answers.ycrash.io/question/what-is-jvm-startup-parameter--xxtlabsize?q=833

      "Chatty" logging for TLAB sizing can be enabled via -Xlog:tlab*=debug,tlab*=trace:file=gc.log:time:filecount=7,filesize=8M

            Assignee: Unassigned
            Reporter: Dominik Stadler (dstadler)
            Votes: 0
            Watchers: 7