Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8299800 | 17.0.7 | Thomas Stuefe | P4 | Resolved | Fixed | b01 |
C-heap buffer overflow checks in the hotspot exist, but the way they are implemented has disadvantages and they can be improved.
At the moment we have a hard-wired buffer overflow recognition in debug builds in `os::malloc()` and friends. We also have NMT, which uses its own malloc headers but does no overflow checks. We have a third layer of guarding done when using `+CheckJNICalls`, also using headers, but that one only affects external JNI functions allocating C-heap. I may have missed some other places. But a hotspot C-heap allocation can accumulate a lot of headers, like barnacles on a ship.
Disadvantages of the current solution:
- Discounting `+CheckJNICalls` - which only covers a tiny subset of C-heap allocations - we have no way to do C-heap checking in release builds. But there, it is sorely missed. We ship release VMs, and those VMs get used in myriad ways with a lot of faulty third-party native code. I would love to be able to flip a product switch at a customer site and have some basic C-heap checks done, without relegating to external tools or debug c-libs.
- The debug-only guards in os::malloc() are quite fat, really, a whopping 48 bytes per allocation on 64-bit, 40 bytes on 32-bit. That is for guarding alone. They distort the memory allocation picture, since blowing up every allocation this way causes the underlying libc to do different things. Therefore we have different memory layouts and allocation patterns between debug and release. In addition, we have different code paths too, e.g. in debug os::realloc calls os::malloc + os::free whereas in release builds it calls directly into libc ::realloc. All that means that in debug builds we test something different than what we ship in release builds.
- The canary in the headers of the debug-only guards do not directly precede the user portion of the data, so we won't catch negative buffer overflows of only a few bytes.
- The guarding added by CheckJNICalls is unnecessarily expensive too, since it copies the memory around, handing a copy of the guarded memory up to the caller.
- The fact that three different code sections all do malloc headers incurs unnecessary costs, and the code is unnecessarily complex. It makes also statistics difficult to understand since the silent overhead can be large (compare the rise in RSS with the rise in NMT allocations in a debug build).
- None of the current overflow checkers print out hex dumps of the violated memory. That is what the libc usually does and it is very useful.
Proposal for a better solution:
- Let NMT do buffer overrun checks:
- NMT is in a perfect position to do so, we are almost there already. It is very well thought out and its malloc header implementation is tight and frugal. At least on 64-bit, we could add overflow canaries to the header without blowing it up, there is still space.
- NMT needs headers to work, so it will access them on each allocation and deallocation. Checking a canary at that point does not add much work.
- Then let us discard the current debug-only solution.
- We would save memory since the current solution is hefty: 48 bytes vs 16 bytes NMT has. If we add footers to NMT, it is 16 + footer, still way less than what we pay now.
- We can throw away coding and make os::malloc and friends simpler
- In debug, we are closer to what we run in release. We can use overflow recognition in both debug and release by simply switching on NMT. We can reproduce the same allocation pattern between debug and release.
- If we miss "always-on buffer overflow recognition" in debug, we can just switch NMT on by default in debug to NMT-level="summary". That does not cost much at all. I believe footprint-wise it would still be cheaper than today's debug-only guards. Also, sinceJDK-8256844 switching on NMT is easier, since `-XX:NativeMemoryTracking` behaves like any other normal option.
At the moment we have a hard-wired buffer overflow recognition in debug builds in `os::malloc()` and friends. We also have NMT, which uses its own malloc headers but does no overflow checks. We have a third layer of guarding done when using `+CheckJNICalls`, also using headers, but that one only affects external JNI functions allocating C-heap. I may have missed some other places. But a hotspot C-heap allocation can accumulate a lot of headers, like barnacles on a ship.
Disadvantages of the current solution:
- Discounting `+CheckJNICalls` - which only covers a tiny subset of C-heap allocations - we have no way to do C-heap checking in release builds. But there, it is sorely missed. We ship release VMs, and those VMs get used in myriad ways with a lot of faulty third-party native code. I would love to be able to flip a product switch at a customer site and have some basic C-heap checks done, without relegating to external tools or debug c-libs.
- The debug-only guards in os::malloc() are quite fat, really, a whopping 48 bytes per allocation on 64-bit, 40 bytes on 32-bit. That is for guarding alone. They distort the memory allocation picture, since blowing up every allocation this way causes the underlying libc to do different things. Therefore we have different memory layouts and allocation patterns between debug and release. In addition, we have different code paths too, e.g. in debug os::realloc calls os::malloc + os::free whereas in release builds it calls directly into libc ::realloc. All that means that in debug builds we test something different than what we ship in release builds.
- The canary in the headers of the debug-only guards do not directly precede the user portion of the data, so we won't catch negative buffer overflows of only a few bytes.
- The guarding added by CheckJNICalls is unnecessarily expensive too, since it copies the memory around, handing a copy of the guarded memory up to the caller.
- The fact that three different code sections all do malloc headers incurs unnecessary costs, and the code is unnecessarily complex. It makes also statistics difficult to understand since the silent overhead can be large (compare the rise in RSS with the rise in NMT allocations in a debug build).
- None of the current overflow checkers print out hex dumps of the violated memory. That is what the libc usually does and it is very useful.
Proposal for a better solution:
- Let NMT do buffer overrun checks:
- NMT is in a perfect position to do so, we are almost there already. It is very well thought out and its malloc header implementation is tight and frugal. At least on 64-bit, we could add overflow canaries to the header without blowing it up, there is still space.
- NMT needs headers to work, so it will access them on each allocation and deallocation. Checking a canary at that point does not add much work.
- Then let us discard the current debug-only solution.
- We would save memory since the current solution is hefty: 48 bytes vs 16 bytes NMT has. If we add footers to NMT, it is 16 + footer, still way less than what we pay now.
- We can throw away coding and make os::malloc and friends simpler
- In debug, we are closer to what we run in release. We can use overflow recognition in both debug and release by simply switching on NMT. We can reproduce the same allocation pattern between debug and release.
- If we miss "always-on buffer overflow recognition" in debug, we can just switch NMT on by default in debug to NMT-level="summary". That does not cost much at all. I believe footprint-wise it would still be cheaper than today's debug-only guards. Also, since
- backported by
-
JDK-8299800 Unify C-heap buffer overrun checks into NMT
-
- Resolved
-