Type: Bug
Resolution: Unresolved
Priority: P3
Fix Version/s: None
Affects Version/s: 17.0.10
Component/s: hotspot
Labels:
- sustaining
Environment:

Encountered with JDK 17.0.8 and 17.0.10

OS Info
Red Hat Enterprise Linux 8.9 (Ootpa)
Host: AMD EPYC 7542 32-Core Processor, 128 cores, 503G, Red Hat Enterprise Linux release 8.9 (Ootpa)
Kernel: Linux 4.18.0-513.9.1.el8_9.x86_64 #1 SMP Thu Nov 16 10:29:04 EST 2023 x86_64 x86_64 x86_64 GNU/Linux
Architecture: x86_64
Processors: 128 CPU


Subcomponent: runtime
CPU: x86_64
OS: linux_redhat_8.0

This looks related to, but slightly different from, the case fixed in ~~JDK-8308766~~.

We see a crash with SIGFPE in ThreadLocalAllocBuffer::initial_desired_size().

It is reproducible only on a specific set of machines; the same application does not crash anywhere else. It is not clear what is necessary to reproduce it elsewhere.

One possible candidate for a SIGFPE in the code is

init_sz = (Universe::heap()->tlab_capacity(thread()) / HeapWordSize) /
(nof_threads * target_refills());

at https://github.com/openjdk/jdk17u/blob/jdk-17.0.10-ga/src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp#L280

HeapWordSize seems to be a constant, but maybe either nof_threads or target_refills() can be zero in some cases?
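
To make the hypothesis concrete: on x86_64 Linux an integer division by zero is delivered as SIGFPE, which matches the signal in the crash log. The following is a minimal stand-alone sketch of the quoted expression with a guard in front of the division. It is not HotSpot code and not a proposed fix; the function name try_initial_size_words and all of its parameters are hypothetical stand-ins for the values used in the expression above.

```cpp
#include <cstddef>

// Hypothetical model of the quoted computation; names are illustrative only.
// Returns false instead of dividing when the divisor would be zero, which is
// the situation suspected of raising SIGFPE in the unguarded expression.
bool try_initial_size_words(std::size_t tlab_capacity_bytes,
                            std::size_t heap_word_size,
                            unsigned int nof_threads,
                            unsigned int target_refills,
                            std::size_t* out_words) {
  // Widen before multiplying so the product is computed in size_t.
  const std::size_t divisor =
      static_cast<std::size_t>(nof_threads) * target_refills;
  if (heap_word_size == 0 || divisor == 0) {
    return false;  // unguarded code would execute an integer division by zero
  }
  *out_words = (tlab_capacity_bytes / heap_word_size) / divisor;
  return true;
}
```

If either nof_threads or target_refills is zero, the function returns false and leaves the output untouched, so a caller could fall back to some other size instead of crashing.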

Excerpt from the hs_err_pid file:

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGFPE (0x8) at pc=0x00007efeed9b5b9c, pid=3048299, tid=3050463
#
# JRE version: (17.0.10+7) (build )
# Java VM: OpenJDK 64-Bit Server VM (17.0.10+7, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0xe68b9c] ThreadLocalAllocBuffer::initial_desired_size()+0x10c

Stack: [0x00007efd9c21e000,0x00007efd9ca1e000], sp=0x00007efd9ca1cd20, free space=8187k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xe68b9c] ThreadLocalAllocBuffer::initial_desired_size()+0x10c
V [libjvm.so+0xe68be4] ThreadLocalAllocBuffer::initialize()+0x24
V [libjvm.so+0x8bfec4] attach_current_thread.part.0+0x94
V [libjvm.so+0x8c023d] jni_AttachCurrentThread+0x6d
C 0x00007efd9cb5b701
C 0x00007efd9cb5ba4e

Potential workarounds:
* Disable TLABs with -XX:-UseTLAB. This may have a large performance impact.
* Set an initial TLAB size explicitly via the JVM parameter -XX:TLABSize=... to try to avoid the code branch that crashes (https://github.com/openjdk/jdk17u/blob/jdk-17.0.10-ga/src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp#L273), e.g. -XX:TLABSize=2k (must be between 1k and 512k). It seems the JDK uses this only as the initial size and resizes TLABs properly afterwards, see https://answers.ycrash.io/question/what-is-jvm-startup-parameter--xxtlabsize?q=833

Verbose logging of TLAB sizes can be enabled via -Xlog:tlab*=debug,tlab*=trace:file=gc.log:time:filecount=7,filesize=8M

relates to

JDK-8308341 JNI_GetCreatedJavaVMs returns a partially initialized JVM

Resolved

JDK-8308766 TLAB initialization may cause div by zero

Resolved

Assignee: Unassigned
Reporter: Dominik Stadler
Votes: 0
Watchers: 7
Created: 2024-04-17 01:42
Updated: 2024-11-26 05:50
