> Note that any reference to pages from here on out refers to the concept of a heap region in ZGC, not pages in the operating system (OS), unless stated otherwise.
# Background
We want to address fragmentation by introducing a Mapped Cache that replaces the Page Cache in ZGC. The largest limitation of the Page Cache is that it is constrained by the abstraction of what a page is. The proposed Mapped Cache removes this limitation by decoupling memory from pages, allowing it to merge and split memory in ways that the Page Cache is not suited for. To facilitate the transition, much of the Page Allocator needs to be redesigned to work with the Mapped Cache.
In addition to fighting fragmentation, the new approach improves NUMA support and simplifies memory unmapping. Combined, these changes lay the foundation for further improvements in ZGC, such as replacing multi-mapped memory with anonymous memory.
# Why a Mapped Cache?
The main benefit of the Mapped Cache is that adjacent virtual memory ranges in the cache can be merged into larger ranges, enabling larger allocation requests to succeed more easily. Most notably, it allows allocations to succeed more often without "harvesting" smaller, discontiguous ranges. Harvesting negatively impacts both fragmentation and latency, as it requires remapping memory into a new contiguous virtual address range. Fragmentation becomes especially problematic in long-running programs and in environments with limited address space, where finding large contiguous regions can be difficult and may lead to premature OutOfMemoryErrors (OOME).
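To make the merging concrete, here is a minimal C++ sketch. It uses a plain `std::map` keyed by range start in place of the cache's actual tree, and all names are hypothetical; it only demonstrates how inserting a freed range coalesces it with touching neighbors, and how a later request can then be served from one contiguous range.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Minimal sketch of the coalescing idea, not ZGC's implementation:
// a std::map from range start to range end stands in for the cache's
// self-balancing tree.
class RangeCache {
  std::map<uintptr_t, uintptr_t> _ranges; // start -> end (exclusive)

public:
  // Insert a freed range [start, end), merging with any cached
  // neighbor that touches it, so larger requests can later be
  // satisfied from one contiguous range.
  void insert(uintptr_t start, uintptr_t end) {
    // Merge with a predecessor that ends exactly where we begin.
    auto it = _ranges.lower_bound(start);
    if (it != _ranges.begin()) {
      auto prev = std::prev(it);
      if (prev->second == start) {
        start = prev->first;
        _ranges.erase(prev);
      }
    }
    // Merge with a successor that begins exactly where we end.
    it = _ranges.find(end);
    if (it != _ranges.end()) {
      end = it->second;
      _ranges.erase(it);
    }
    _ranges[start] = end;
  }

  // Remove and return the start of a cached range of at least `size`
  // bytes (first-fit for brevity), or 0 if none is large enough.
  uintptr_t remove(size_t size) {
    for (auto it = _ranges.begin(); it != _ranges.end(); ++it) {
      if (it->second - it->first >= size) {
        uintptr_t start = it->first;
        uintptr_t end = it->second;
        _ranges.erase(it);
        if (start + size < end) {
          _ranges[start + size] = end; // split: keep the tail cached
        }
        return start;
      }
    }
    return 0;
  }
};
```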
The Mapped Cache uses a self-balancing binary search tree to store memory ranges. Since the ranges are unused when inside the cache, the tree can use this memory to store metadata about itself, referred to as intrusive storage. This approach eliminates the need for dynamic memory allocation (e.g., malloc), which could otherwise introduce a latency overhead.
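The sketch below illustrates the intrusive-storage idea, assuming a cached range is still mapped and writable: the node's metadata is placed at the start of the freed range itself with placement `new`, so the tree needs no dynamic allocation for its bookkeeping. The struct layout and names are illustrative, not ZGC's actual types.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Illustrative node layout for an intrusive red-black tree; the
// fields and names are assumptions, not ZGC's actual types.
struct CacheNode {
  uintptr_t  _start;   // start address of the cached range
  size_t     _size;    // size of the cached range in bytes
  CacheNode* _left;    // tree links live in the cached memory itself
  CacheNode* _right;
  CacheNode* _parent;
  bool       _red;     // color bit for red-black balancing
};

// Place the node's metadata at the start of the freed range itself
// (assuming the range is still mapped and writable while cached),
// eliminating the need for malloc when inserting into the tree.
static CacheNode* make_node(uintptr_t start, size_t size) {
  void* storage = reinterpret_cast<void*>(start);
  return new (storage) CacheNode{start, size, nullptr, nullptr, nullptr, true};
}
```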
# Fragmentation
ZGC currently has multiple strategies for dealing with fragmentation, but in some edge cases these strategies are not as efficient as we would like. By addressing fragmentation differently with the Mapped Cache, ZGC is better positioned to avoid such edge cases, which are costly even if they occur only once. This is especially impactful for programs running with a large heap.
## Virtual Memory Shuffling
In addition to the Mapped Cache, we propose some adjustments to how ZGC deals with virtual memory. Harvested memory needs to be remapped, which requires first claiming a new contiguous virtual memory range. We propose a new feature in which the harvested virtual memory can be reused, improving the likelihood of finding a contiguous range. Additionally, the defragmentation policy should be redesigned so that Large pages are always defragmented upon being freed: they are broken down and remapped into lower address space, in the hope of "filling holes" and creating larger contiguous ranges.
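A rough sketch of this Large-page policy, where `claim_lowest_virtual`, `remap`, and `cache_insert` are hypothetical helpers standing in for ZGC's virtual- and physical-memory managers (the real interfaces differ):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers standing in for ZGC's virtual- and physical-
// memory managers; the real interfaces differ.
struct VirtualRange {
  uintptr_t start;
  size_t    size;
};

VirtualRange claim_lowest_virtual(size_t size);        // assumed helper
void remap(uintptr_t from, uintptr_t to, size_t size); // assumed helper
void cache_insert(VirtualRange range);                 // assumed helper

// Sketch of the proposed policy: a freed Large page is broken down
// into granules and remapped into the lowest available virtual
// addresses, filling holes left by earlier allocations.
void free_large_page(VirtualRange page, size_t granule_size) {
  for (size_t offset = 0; offset < page.size; offset += granule_size) {
    VirtualRange dest = claim_lowest_virtual(granule_size);
    remap(page.start + offset, dest.start, granule_size);
    cache_insert(dest);
  }
}
```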
# NUMA and Partitions
In the current policy, ZGC interleaves memory across all NUMA nodes at a granularity of ZGranuleSize (2MB), which is the same size as a Small page. As a result, a Small page ends up on a single, preferably local, NUMA node, while larger allocations will likely span multiple NUMA nodes. In the new design, the policy is to prefer allocating *all* allocation sizes on the local NUMA node whenever possible, which may allow ZGC to extract better performance from NUMA systems.
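As a rough illustration (not ZGC code) of why allocation size matters under the current policy: with 2MB interleaving, consecutive granules land on different nodes, so any allocation larger than one granule straddles nodes.

```cpp
#include <cstddef>

// Illustration only: with memory interleaved across NUMA nodes at
// ZGranuleSize granularity, consecutive 2MB granules land on
// different nodes, roughly round-robin.
const size_t ZGranuleSize = 2 * 1024 * 1024;

size_t node_for_granule(size_t heap_offset, size_t num_numa_nodes) {
  return (heap_offset / ZGranuleSize) % num_numa_nodes;
}

// A Small page is exactly one granule and therefore sits on a single
// node, whereas e.g. a 32MB allocation covers 16 granules spread
// across nodes under this policy.
```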
To support local NUMA allocations, the Page Allocator, and in turn the Java heap, is split into what we refer to as Partitions. A partition keeps track of its own heap size and its own Mapped Cache, allowing it to handle only the memory associated with its share of the heap. The number of partitions currently equals the number of NUMA nodes; on non-NUMA systems, only a single partition is used.
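A minimal sketch of the partition idea; the fields and the local-first fallback strategy shown here are assumptions for illustration, not the actual implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical partition layout: each partition is tied to one NUMA
// node and owns its share of the heap (and its own Mapped Cache).
struct Partition {
  size_t _numa_id;   // the NUMA node this partition is associated with
  size_t _available; // free bytes in this partition's share of the heap
};

// Prefer the partition local to the allocating thread's NUMA node,
// falling back to remote partitions only when the local one cannot
// satisfy the request.
Partition* select_partition(std::vector<Partition>& partitions,
                            size_t local_numa_id, size_t size) {
  Partition* local = &partitions[local_numa_id];
  if (local->_available >= size) {
    return local;
  }
  for (Partition& p : partitions) {
    if (&p != local && p._available >= size) {
      return &p;
    }
  }
  return nullptr; // no partition can satisfy the request
}
```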
The introduction of partitions also establishes a foundation for more fine-grained control over the heap, paving the way for future enhancements, both in NUMA handling and in new features such as Thread-Local GC.
# Defragmentation (Unmapping Memory)
Up until now, ZGC has unmapped memory asynchronously in a separate thread, so that other threads do not need to take a latency hit when unmapping memory. The main dependency on asynchronous unmapping is harvesting, especially from a mutator thread, where synchronous unmapping could introduce unwanted latency.
With the introduction of the Mapped Cache, and by moving defragmentation away from mutator threads to the GC, asynchronous unmapping is no longer necessary to meet our latency goals. Instead, memory should be unmapped synchronously. The number of times memory is defragmented for page allocations has been reduced significantly. In fact, memory for Small pages never needs to be defragmented at all. For Large pages, memory defragmentation has little effect on the total latency, as they are costly to allocate anyway. For Medium pages, we have plans for future enhancements where memory is defragmented even less, or not at all.
For clarity: with the removal of asynchronous unmapping, the ZUnmapper thread and ZUnmap JFR event should be removed.
# Multi-Mapped Memory
Asynchronous unmapping has so far been possible because ZGC is backed by shared memory (on Linux), which allows memory to be multi-mapped. This is an artifact of non-generational ZGC, which used multi-mapping in its core design (see [this](https://wiki.openjdk.org/display/zgc/Pointer+Metadata+using+Multi-Mapped+memory) resource for more info). A goal we have in ZGC is to move from shared memory to anonymous memory. Anonymous memory has multiple benefits, one of them being easier configuration of Transparent Huge Pages (OS pages). However, anonymous memory does not support multi-mapping, so the transition would be blocked by the asynchronous unmapping feature. With the removal of asynchronous unmapping, we are better prepared for transitioning to anonymous memory.
# Additional Notes
The Mapped Cache will initially use a custom red-black tree implementation. Another red-black tree was recently introduced by C. Norrbin in JDK-8345314 (and enhanced in JDK-8349211). Our goal is to start with the custom implementation and replace it with Norrbin's tree in the future; we have our own implementation because Norrbin's tree was not finished at the time we were developing and testing this feature.
Some additions have been made to preserve the current functionality of the Serviceability Agent (SA).