On POSIX platforms, when mapping memory to specific addresses, the VM should avoid mapping too close behind the program break (brk, sbrk).
Many libc implementations use sbrk to implement malloc. Blocking sbrk from growing by placing a mapping in its way can cause subsequent malloc() and brk()/sbrk() calls to fail, as we have seen repeatedly in the past on Solaris and AIX (see e.g. JDK-8024669 or JDK-8160638).
This is a platform-independent problem though. We saw it mostly on Solaris and AIX because there the program break was typically located in low address ranges and hence could clash with our compressed-pointer-friendly heap or class space reservation. But the same can happen regardless of where the program break is located, since our mapping APIs allow mapping at arbitrary addresses. On Linux, blocking the break can impede the glibc malloc allocator, which also uses sbrk. The resulting error is almost indistinguishable from a normal native OOM.
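To illustrate the failure mode, here is a minimal sketch (not HotSpot code) that places a one-page mapping directly above the current break and then tries to grow the break with sbrk(). The names BreakProbe and probe_blocked_break are hypothetical. It assumes a POSIX system where sbrk() is still available and where the kernel honors the mmap address hint if the range is free; neither is guaranteed.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical result type for this sketch.
struct BreakProbe {
  bool mapped_at_hint;  // kernel placed the mapping directly above the break
  bool sbrk_failed;     // the following sbrk() growth attempt failed
  int  sbrk_errno;      // errno of the failed sbrk(), 0 otherwise
};

// Maps one PROT_NONE page right above the program break, then tries to
// grow the break across it. Cleans up after itself.
BreakProbe probe_blocked_break() {
  BreakProbe r = { false, false, 0 };
  const long pagesize = sysconf(_SC_PAGESIZE);
  assert(pagesize > 0);
  const size_t page = (size_t)pagesize;

  void* cur = sbrk(0);  // current program break
  if (cur == (void*)-1) return r;

  // First page boundary at or above the break.
  const uintptr_t hint = ((uintptr_t)cur + page - 1) & ~(uintptr_t)(page - 1);

  // No MAP_FIXED: the address is only a hint, so nothing gets clobbered.
  void* p = mmap((void*)hint, page, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return r;
  r.mapped_at_hint = ((uintptr_t)p == hint);

  if (r.mapped_at_hint) {
    errno = 0;
    void* old = sbrk((intptr_t)(16 * page));  // would have to cross the mapping
    if (old == (void*)-1) {
      r.sbrk_failed = true;
      r.sbrk_errno = errno;
    } else {
      sbrk(-(intptr_t)(16 * page));  // unexpectedly succeeded; undo the growth
    }
  }
  munmap(p, page);
  return r;
}
```

If the mapping lands at the hint, we would expect sbrk_failed to be set and sbrk_errno to be ENOMEM, which is the same errno an allocator sees for a genuine out-of-memory condition.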
IIUC we have dealt with this in a roundabout way by specifying a lowest allowed heap address, HeapBaseMinAddress, and defaulting it to 2G in an attempt to spare low address ranges from overly eager heap allocations.
But that is both too strict and not sufficient. It is too strict because it protects the low address regions even if the program break does not reside there; on Linux, for example, it prevents us from mapping the heap at very low addresses even though that would be perfectly possible. It is not sufficient because, if the program break resides in higher address ranges, it can still be blocked by accidental mappings, and HeapBaseMinAddress offers no protection there.
On AIX, we have always had a different way to deal with this:
https://github.com/openjdk/jdk/blob/0a4e710ff600d001c0464f5b7bb5d3a2cd461c06/src/hotspot/os/aix/os_aix.cpp#L236-L249
where we establish a no-fly zone (with configurable size) behind the break and prevent os::attempt_reserve_memory_at() from attaching there. That leaves os::attempt_reserve_memory_at() free to allocate elsewhere, e.g. below the AIX data segment. Because the size is configurable, one of the things to try whenever we did hit a malloc ENOMEM was to increase the no-fly zone.
I think this would be a sensible approach for all platforms.
Going forward, this could also allow us to get rid of HeapBaseMinAddress, or at least lower its value; that would give us up to 2G more address space to map things into in the coveted lower address regions.
Relates to:
- JDK-8160638 Solaris JVM unable to allocate more than 2GB of direct byte buffers when max heap is <= 2GB (Open)
- JDK-8024669 Native OOME when allocating after changes to maximum heap supporting Coops sizing on sparcv9 (Closed)