JDK-8261242

[Linux] OSContainer::is_containerized() returns true when run outside a container



      Currently, the HotSpot code that determines whether the JVM runs in a container may return false positives on a plain Linux host.

      This can be observed, for example, by running jshell with container trace logging. Many traces show up because -XX:+UseDynamicNumberOfCompilerThreads is on by default, and that feature queries the available memory through the container detection code:
      $ jshell -J-Xlog:os+container=trace

      Bob mentioned that there was no reliable way to detect whether or not a JVM runs in a container:

      https://bugs.openjdk.java.net/browse/JDK-8227006?focusedCommentId=14275609&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14275609

      I believe this has changed. We should be able to determine whether we run in a container by looking at the cgroup controller mounts: container engines typically mount them read-only inside the container, while on a host system they are mounted read-write. This is useful for detecting the "inside a container" case. Note that the mount options are field 6 of /proc/[pid]/mountinfo, as documented in proc(5).

      Host system case (note the 'rw' mount options on the cgroup controller mounts):
      $ grep cgroup /proc/self/mountinfo
      53 51 0:27 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:7 - tmpfs tmpfs ro,seclabel,size=4096k,nr_inodes=1024,mode=755,inode64
      54 53 0:28 / /sys/fs/cgroup/unified rw,nosuid,nodev,noexec,relatime shared:8 - cgroup2 cgroup2 rw,seclabel,nsdelegate
      55 53 0:29 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,seclabel,xattr,name=systemd
      56 53 0:33 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:10 - cgroup cgroup rw,seclabel,blkio
      57 53 0:34 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,seclabel,net_cls,net_prio
      58 53 0:35 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:12 - cgroup cgroup rw,seclabel,cpu,cpuacct
      59 53 0:36 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,seclabel,pids
      60 53 0:37 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,seclabel,memory
      61 53 0:38 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,seclabel,rdma
      62 53 0:39 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,seclabel,freezer
      63 53 0:40 / /sys/fs/cgroup/misc rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,seclabel,misc
      64 53 0:41 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,seclabel,perf_event
      65 53 0:42 / /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,seclabel,hugetlb
      66 53 0:43 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:20 - cgroup cgroup rw,seclabel,cpuset
      67 53 0:44 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,seclabel,devices

      Container case (note the 'ro' mount options on the cgroup controller mounts):
      # grep cgroup /proc/self/mountinfo
      1531 1508 0:119 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs cgroup rw,context="system_u:object_r:container_file_t:s0:c405,c449",size=1024k,uid=15263,gid=15263,inode64
      1532 1531 0:44 /user.slice /sys/fs/cgroup/devices ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,devices
      1533 1531 0:43 / /sys/fs/cgroup/cpuset ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpuset
      1534 1531 0:42 / /sys/fs/cgroup/hugetlb ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,hugetlb
      1535 1531 0:41 / /sys/fs/cgroup/perf_event ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,perf_event
      1536 1531 0:40 / /sys/fs/cgroup/misc ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,misc
      1537 1531 0:39 / /sys/fs/cgroup/freezer ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,freezer
      1538 1531 0:38 / /sys/fs/cgroup/rdma ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,rdma
      1539 1531 0:37 /user.slice/user-15263.slice/user@15263.service /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory
      1540 1531 0:36 /user.slice/user-15263.slice/user@15263.service /sys/fs/cgroup/pids ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,pids
      1541 1531 0:35 /user.slice/user-15263.slice/user@15263.service /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpu,cpuacct
      1542 1531 0:34 / /sys/fs/cgroup/net_cls,net_prio ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,net_cls,net_prio
      1543 1531 0:33 /user.slice/user-15263.slice/user@15263.service /sys/fs/cgroup/blkio ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,blkio
      1544 1531 0:29 /user.slice/user-15263.slice/user@15263.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-0f301a31-cd1d-4b62-b798-9810bc79990b.scope /sys/fs/cgroup/systemd ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,xattr,name=systemd
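
      As an illustration of the heuristic described above, here is a minimal, hedged sketch (plain C++, not the actual HotSpot code) that scans /proc/self/mountinfo, picks out the cgroup/cgroup2 mounts by their filesystem type, and reports whether their per-mount options (field 6) are read-only:

      #include <fstream>
      #include <iostream>
      #include <sstream>
      #include <string>
      #include <vector>

      // Returns true if all cgroup/cgroup2 mounts seen in /proc/self/mountinfo
      // are mounted read-only (the "inside a container" hint described above).
      static bool cgroup_mounts_read_only() {
        std::ifstream mi("/proc/self/mountinfo");
        std::string line;
        bool saw_cgroup = false;
        while (std::getline(mi, line)) {
          std::istringstream is(line);
          std::vector<std::string> fields;
          for (std::string f; is >> f; ) fields.push_back(f);
          // The filesystem type follows the "-" separator (see proc(5)).
          size_t sep = 0;
          while (sep < fields.size() && fields[sep] != "-") sep++;
          if (fields.size() < 6 || sep + 1 >= fields.size()) continue;
          const std::string& fstype = fields[sep + 1];
          if (fstype != "cgroup" && fstype != "cgroup2") continue;
          saw_cgroup = true;
          // Field 6 (index 5) holds the per-mount options, e.g. "ro,nosuid,...".
          const std::string& opts = fields[5];
          if (opts != "ro" && opts.rfind("ro,", 0) != 0) {
            return false; // a read-write controller mount suggests a host system
          }
        }
        return saw_cgroup;
      }

      int main() {
        std::cout << (cgroup_mounts_read_only() ? "cgroup mounts read-only (container?)"
                                                : "cgroup mounts read-write (host?)")
                  << std::endl;
        return 0;
      }

      Note that the tmpfs mount of /sys/fs/cgroup itself is ignored in this sketch; only the controller mounts matter for the heuristic.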

      Yet, looking at rw/ro mount options alone isn't enough. Features like JDK-8217338 have been added so that the container detection code can pick up memory/CPU limits enforced by other means. We would introduce a regression if we only looked at the read/write property of the controller mounts. Therefore, we need a fallback that inspects the container limits at OSContainer::init time. If any limits are present, we could set OSContainer::is_containerized() to true for that reason.
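
      For illustration only, a hedged sketch of that fallback, assuming cgroup v1 with the memory controller mounted at /sys/fs/cgroup/memory (the real check would go through the existing cgroup subsystem code and would also consider CPU limits):

      #include <climits>
      #include <fstream>
      #include <iostream>

      // Rough fallback check: does the current cgroup impose a memory limit?
      // On cgroup v1 an "unlimited" memory.limit_in_bytes is a value close to
      // LLONG_MAX, so anything well below that is treated as an actual limit.
      static bool has_memory_limit() {
        std::ifstream f("/sys/fs/cgroup/memory/memory.limit_in_bytes");
        unsigned long long limit = 0;
        if (!(f >> limit)) return false;  // controller not readable/available
        return limit < (unsigned long long) LLONG_MAX / 2;
      }

      int main() {
        std::cout << (has_memory_limit() ? "memory limit present -> treat as containerized"
                                         : "no memory limit detected") << std::endl;
        return 0;
      }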

      Using only the fallback approach is insufficient, since it is expected (and asserted in the container tests) that is_containerized() returns true when OpenJDK runs inside a container even without a limit set.
