JDK-8293472

Incorrect container resource limit detection if manual cgroup fs mounts present

Details

    • b16
    • generic
    • linux

      Description

        On systems with multiple cgroup fs mount entries in /proc/self/mountinfo (for example, after a manual cgroup fs mount), the JVM may resolve the wrong path to the cgroup interface files and, as a result, detect incorrect resource limits.
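To check whether a system is in this state, the mount table can be scanned for cgroup v1 controllers that are mounted more than once. The following is a minimal diagnostic sketch, not the detection logic used by HotSpot. The sample mountinfo text is hypothetical (only the two cpuset mount points are taken from the warning quoted in this report), and the function names and the controller set are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical mountinfo excerpt. Only the two cpuset mount points come from
# the warning quoted in this report; IDs, devices and options are made up.
SAMPLE_MOUNTINFO = """\
30 23 0:26 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,cpuset
31 23 0:27 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:12 - cgroup cgroup rw,memory
120 23 0:26 / /cgroup-in/cpuset rw,relatime - cgroup cgroup rw,cpuset
121 23 0:40 / /proc rw,relatime - proc proc rw
"""

# Controller names looked for in the super options of a cgroup v1 mount
# (an illustrative subset, not an exhaustive list).
V1_CONTROLLERS = {"cpu", "cpuacct", "cpuset", "memory", "pids"}


def cgroup_mounts(mountinfo_text):
    """Map each cgroup v1 controller to the mount points that expose it."""
    mounts = defaultdict(list)
    for line in mountinfo_text.splitlines():
        # Fields before " - " describe the mount, fields after it the filesystem.
        left, sep, right = line.partition(" - ")
        if not sep:
            continue
        left_fields = left.split()
        right_fields = right.split()
        if len(left_fields) < 5 or len(right_fields) < 3:
            continue
        fs_type, super_opts = right_fields[0], right_fields[2]
        if fs_type != "cgroup":  # skips cgroup2 and unrelated filesystems
            continue
        mount_point = left_fields[4]
        for opt in super_opts.split(","):
            if opt in V1_CONTROLLERS:
                mounts[opt].append(mount_point)
    return dict(mounts)


def duplicate_controllers(mountinfo_text):
    """Return only the controllers that are mounted at more than one place."""
    return {c: p for c, p in cgroup_mounts(mountinfo_text).items() if len(p) > 1}


if __name__ == "__main__":
    for controller, paths in duplicate_controllers(SAMPLE_MOUNTINFO).items():
        print(controller, paths)
```

On a real system the same check can be run against the live table by passing the contents of /proc/self/mountinfo to `duplicate_controllers`.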

        On cgroup v1 (cg1) with a debug VM, the symptom is similar to JDK-8253435. The VM fails with an assertion:

        #
        # A fatal error has been detected by the Java Runtime Environment:
        #
        # Internal Error (/data/openjdk/jdk/src/hotspot/os/linux/cgroupSubsystem_linux.cpp:335), pid=578, tid=583
        # assert(cg_infos[3]._mount_path == __null) failed: stomping of _mount_path
        #
        # JRE version: (20.0) (fastdebug build )
        # Java VM: OpenJDK 64-Bit Server VM (fastdebug 20-internal-adhoc.root.jdk, mixed mode, sharing, tiered, unknown gc, linux-amd64)
        # Core dump will be written. Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e" (or dumping to /data1/test/java/2022-09-06-21-58-28/core.578)
        #
        #

        --------------- S U M M A R Y ------------

        Command Line: Test

        Host: VM-235-31-centos, AMD EPYC 7K62 48-Core Processor, 16 cores, 31G, Ubuntu 20.04.4 LTS
        Time: Wed Sep 7 15:06:04 2022 CST elapsed time: 0.002658 seconds (0d 0h 0m 0s)

        --------------- T H R E A D ---------------

        Current thread is native thread

        Stack: [0x00007ffff569b000,0x00007ffff579c000], sp=0x00007ffff5794290, free space=996k
        Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
        V [libjvm.so+0x19cded2] VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x1a2 (cgroupSubsystem_linux.cpp:335)
        V [libjvm.so+0x19ced8f] VMError::report_and_die(Thread*, void*, char const*, int, char const*, char const*, __va_list_tag*)+0x2f (vmError.cpp:1466)
        V [libjvm.so+0xac790b] report_vm_error(char const*, int, char const*, char const*, ...)+0x11b (debug.cpp:284)
        V [libjvm.so+0x8d465b] CgroupSubsystemFactory::determine_type(CgroupInfo*, char const*, char const*, char const*, unsigned char*)+0xabb (cgroupSubsystem_linux.cpp:335)
        V [libjvm.so+0x8d5236] CgroupSubsystemFactory::create()+0xe6 (cgroupSubsystem_linux.cpp:53)
        V [libjvm.so+0x14f7011] OSContainer::init()+0x71 (osContainer_linux.cpp:57)
        V [libjvm.so+0x64eacc] Arguments::parse_vm_init_args(JavaVMInitArgs const*, JavaVMInitArgs const*, JavaVMInitArgs const*, JavaVMInitArgs const*)+0x16c (os.hpp:243)
        V [libjvm.so+0x64ef5d] Arguments::parse(JavaVMInitArgs const*)+0x47d (arguments.cpp:4014)
        V [libjvm.so+0x1913b5a] Threads::create_vm(JavaVMInitArgs*, bool*)+0x9a (threads.cpp:453)
        V [libjvm.so+0x1016539] JNI_CreateJavaVM+0x99 (jni.cpp:3628)
        C [libjli.so+0x40fa] JavaMain+0x8a (java.c:1457)
        C [libjli.so+0x7859] ThreadJavaMain+0x9 (java_md.c:650)

        On cgroup v2 (cg2) the VM might continue to run but use an incorrect container limit value. See the comment below.
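Why a wrong mount path yields a wrong limit rather than an error can be illustrated with a small sketch. It assumes the usual cgroup v2 convention that `memory.max` holds either a byte count or the literal `max`. The two directories are temporary stand-ins for two mount points, the 512 MiB value is made up, and `read_memory_limit` is a hypothetical helper, not a HotSpot function.

```python
import os
import tempfile


def read_memory_limit(mount_point, cgroup_path="/"):
    """Read memory.max below a cgroup v2 mount point.

    Returns the limit in bytes, or None when the file says "max" (no limit).
    """
    path = os.path.join(mount_point, cgroup_path.lstrip("/"), "memory.max")
    with open(path) as f:
        value = f.read().strip()
    return None if value == "max" else int(value)


def demo():
    """Build two fake mount points and read the limit through each of them."""
    with tempfile.TemporaryDirectory() as root:
        limited = os.path.join(root, "mount-a")    # stand-in: limit is set here
        unlimited = os.path.join(root, "mount-b")  # stand-in: reports "max"
        for mount, content in ((limited, "536870912\n"), (unlimited, "max\n")):
            os.makedirs(mount)
            with open(os.path.join(mount, "memory.max"), "w") as f:
                f.write(content)
        return read_memory_limit(limited), read_memory_limit(unlimited)


if __name__ == "__main__":
    print(demo())
```

Both reads succeed, so nothing signals that the second path was the wrong one; the caller simply ends up with a different limit.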

        On a cg1 system an additional symptom is a warning on 'java -version':

        [0.000s][warning][os,container] Duplicate cpuset controllers detected. Picking /sys/fs/cgroup/cpuset, skipping /cgroup-in/cpuset.
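When checking many hosts, this warning can be picked out of captured JVM output mechanically. The sketch below is one possible approach: the regular expression is modelled only on the single line quoted above, so the exact wording in other JDK versions is an assumption, and the function names are illustrative.

```python
import re

# Pattern modelled on the one warning line quoted in this report. The wording
# in other JDK versions may differ, so treat the pattern as an assumption.
WARNING_RE = re.compile(
    r"Duplicate (?P<controller>\S+) controllers detected\. "
    r"Picking (?P<picked>\S+), skipping (?P<skipped>\S+?)\.?$"
)


def parse_duplicate_warning(line):
    """Return controller, picked path and skipped path, or None if no match."""
    match = WARNING_RE.search(line.strip())
    return match.groupdict() if match else None


def find_duplicate_warnings(output):
    """Collect every duplicate-controller warning found in multi-line output."""
    parsed = (parse_duplicate_warning(line) for line in output.splitlines())
    return [p for p in parsed if p is not None]


if __name__ == "__main__":
    sample = (
        "[0.000s][warning][os,container] Duplicate cpuset controllers detected. "
        "Picking /sys/fs/cgroup/cpuset, skipping /cgroup-in/cpuset."
    )
    print(parse_duplicate_warning(sample))
```

The skipped path is the one worth inspecting first, since it identifies the extra mount.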

              People

                wchao Wang Chao