JDK-8292083

Detected container memory limit may exceed physical machine memory

    • 10
    • b13
    • generic
    • linux

        The current HotSpot code in osContainer_linux.cpp uses the following snippet in OSContainer::init() to set the VM's physical memory from the detected container limits:

          // We need to update the amount of physical memory now that
          // cgroup subsystem files have been processed.
          if ((mem_limit = cgroup_subsystem->memory_limit_in_bytes()) > 0) {
            os::Linux::set_physical_memory(mem_limit);
            log_info(os, container)("Memory Limit is: " JLONG_FORMAT, mem_limit);
          }

        If the detection mechanism happens to find a value in the cgroup limit files that is larger than the physical memory of the host the container runs on, this code proceeds without complaint. The result is a broken JVM that is at risk of being OOM-killed, among other problems.

        This appears to have been present since the initial JDK-8146115 implementation in JDK 10:
        https://hg.openjdk.java.net/jdk/jdk/rev/7f22774a5f42#l4.43

        We have seen cgroup v1 systems (see trace attachment) that had no cgroup limits in effect and yet reported this value in /sys/fs/cgroup/memory/memory.limit_in_bytes: 92233720365056. That value far exceeds the host's 8 GB of physical memory. Unlike cgroups v2, the cgroups v1 files have no unique value such as "max" to denote unlimited. A contrived "unlimited" value is therefore used to decide whether a value read from the file is a real limit. For cgroups v1, _unlimited_memory is set to (LONG_MAX / os::vm_page_size()) * os::vm_page_size(), which evaluates to 9223372036854771712 on some systems (for example, with a 64-bit long and 4 KiB pages). The detected limit thus ends up being 92233720365056, since that is less than 9223372036854771712 [1]. Any value in the memory.limit_in_bytes cgroup interface file that is smaller than (LONG_MAX / os::vm_page_size()) * os::vm_page_size() but larger than physical memory triggers this bug.
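
        The arithmetic above can be checked with a small standalone sketch. It is not HotSpot code: the helper names are hypothetical, the page size is assumed to be 4096 bytes, LONG_MAX is modelled as a 64-bit value, "8 GB" is taken as 8 GiB, and the limit check is assumed to be a plain less-than comparison against the sentinel.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the expression quoted in the report:
// (LONG_MAX / page_size) * page_size, with LONG_MAX assumed to be 64-bit.
constexpr std::int64_t unlimited_sentinel(std::int64_t page_size) {
  return (INT64_MAX / page_size) * page_size;
}

// Hypothetical helper: a raw cgroup v1 value counts as a real limit when it
// is strictly below the sentinel (assumed comparison, for illustration only).
constexpr bool is_treated_as_limit(std::int64_t raw, std::int64_t page_size) {
  return raw < unlimited_sentinel(page_size);
}

constexpr std::int64_t kPageSize = 4096;                          // assumption
constexpr std::int64_t kObserved = 92233720365056LL;              // value from the report
constexpr std::int64_t kHostPhysical = 8LL * 1024 * 1024 * 1024;  // 8 GiB, assumption

// The sentinel matches the value quoted in the report for 4 KiB pages.
static_assert(unlimited_sentinel(kPageSize) == 9223372036854771712LL,
              "sentinel for 4 KiB pages");
// The observed value is below the sentinel, so it is accepted as a limit...
static_assert(is_treated_as_limit(kObserved, kPageSize),
              "observed value passes as a limit");
// ...even though it is far larger than the host's physical memory.
static_assert(kObserved > kHostPhysical,
              "observed value exceeds host memory");
```

        All three static assertions hold at compile time under these assumptions, which is the combination that lets an effectively unlimited container report a bogus limit.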

        At the very least, the detected container memory limit should be capped at the host's physical memory.
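
        One way such a bound could look is sketched below. This is an illustration only, not the actual fix: effective_memory is a hypothetical helper, and the rule it applies (ignore non-positive limits and any limit that is not below host memory) is an assumption based on the snippet from OSContainer::init() quoted earlier.

```cpp
#include <cstdint>

// Hypothetical helper, not part of HotSpot. Returns the amount of memory the
// VM should treat as physical memory:
//  - the container limit, if it is positive and below the host's memory;
//  - otherwise the host's physical memory (no limit, or a bogus limit).
constexpr std::int64_t effective_memory(std::int64_t container_limit,
                                        std::int64_t host_physical) {
  return (container_limit > 0 && container_limit < host_physical)
             ? container_limit
             : host_physical;
}
```

        With the values from this report, effective_memory(92233720365056, 8 GiB) yields the host's 8 GiB, while a genuine 2 GiB container limit would still be honoured.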

        [1] https://github.com/openjdk/jdk/blob/3677b55b45746c3c955a8fcf1fbbf15694baa873/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L94

              jdowland Jonathan Dowland
              sgehwolf Severin Gehwolf