JDK-8359683

ZGC: NUMA-Aware Relocation


    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • Fix Version/s: 26
    • Affects Version/s: None
    • Component: hotspot
    • Subcomponent: gc
    • Resolved In Build: b13

      With JDK-8350441, ZGC got infrastructure to prefer allocations to end up on a specific NUMA node. When a new object is allocated, it is preferably placed on the NUMA node that is local to the allocating thread. This strategy improves access speeds for mutators working on that object, as long as it continues to be used by threads on the same NUMA node. However, when relocating objects, ZGC may move (migrate) objects away from the NUMA node they were originally allocated on. This means that if a page is selected as part of the Relocation Set, the objects on that page could be moved to another NUMA node, breaking the NUMA locality we strove for when allocating.
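
      As a rough, OS-level illustration of what "place the allocation on the allocating thread's node" amounts to (ZGC implements this preference inside its own heap and page allocator, not via libnuma; the sketch below is only a conceptual analogue, using real libnuma/glibc calls):

      // Conceptual illustration only (Linux + libnuma, link with -lnuma).
      // ZGC does the equivalent inside its own page allocator; this just shows
      // the idea of placing memory on the node local to the allocating thread.

      #include <numa.h>     // numa_available(), numa_node_of_cpu(), numa_alloc_local(), numa_free()
      #include <sched.h>    // sched_getcpu() (GNU extension)
      #include <cstdio>

      int main() {
        if (numa_available() < 0) {
          return 0;                                  // machine/kernel without NUMA support
        }
        // The NUMA node the allocating thread is currently running on.
        const int node = numa_node_of_cpu(sched_getcpu());

        // Memory placed on that local node, mirroring the allocation preference.
        void* mem = numa_alloc_local(4096);
        std::printf("allocating thread is on node %d; memory allocated node-locally\n", node);

        numa_free(mem, 4096);
        return 0;
      }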

      As a follow-up, we should consider adding NUMA-awareness to ZGC's relocation phase. This NUMA-awareness consists of two main features:

      First: GC threads should strive to keep objects NUMA-local to their original node, meaning that objects should ideally be relocated to a page on the same NUMA node.

      Mutator threads should take a different approach: we know that a mutator that is (helping out with) relocating an object is also going to access it, so we migrate the object to the NUMA node associated with the relocating thread. This strategy is already in effect and does not require any code changes (specifically, ZObjectAllocator already tracks per-CPU Small pages). However, Medium pages are shared between CPUs and therefore carry no guarantee about which NUMA node they are on. Since neither relocation by mutators nor relocation of Medium pages is common, there is little to gain from introducing NUMA-awareness for those specific scenarios. Instead, this can be addressed in a follow-up if we feel it is necessary.
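
      A hedged sketch of the GC-thread policy from the first point: the Page and TargetPageCache types below are made up for illustration, and a real implementation would pull empty target pages from ZGC's own page allocator, but the preference order is the one described above.

      // Illustration only; names and the page cache are hypothetical. A GC thread
      // picks a relocation target page on the same NUMA node as the source page,
      // and only falls back to another node if no local page is available.

      #include <cstdint>
      #include <vector>

      struct Page {
        uint32_t numa_node;   // node the page's memory is bound to
        bool     claimed;     // already handed out as a relocation target
      };

      struct TargetPageCache {
        std::vector<Page> pages;

        Page* claim(uint32_t node, bool any_node) {
          for (Page& p : pages) {
            if (!p.claimed && (any_node || p.numa_node == node)) {
              p.claimed = true;
              return &p;
            }
          }
          return nullptr;
        }
      };

      // GC-thread policy: keep the object's original NUMA locality when possible.
      Page* select_gc_relocation_target(TargetPageCache& cache, const Page& source) {
        if (Page* local = cache.claim(source.numa_node, /*any_node=*/false)) {
          return local;                                          // same node as before
        }
        return cache.claim(source.numa_node, /*any_node=*/true); // don't stall relocation
      }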

      Second: when the GC chooses a page from the Relocation Set to relocate objects from, it should choose page(s) that are local to its own NUMA node, to speed up the work by operating on NUMA-local memory. There are multiple ways to achieve this, but the main goal should be to (1) start with pages that are local to the GC thread's NUMA node, and (2) when finished with pages on its own node, start working on (helping out with) pages associated with other NUMA nodes.
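
      A hedged sketch of this claiming order; all names are hypothetical, and the synchronization a real multi-worker implementation would need is omitted.

      // Illustration only. The relocation set is partitioned per NUMA node and a
      // worker drains its own node's pages before helping out with other nodes.

      #include <cstdint>
      #include <queue>
      #include <vector>

      struct Page { uint32_t numa_node; };

      struct PartitionedRelocationSet {
        std::vector<std::queue<Page*>> per_node;     // index = NUMA node id

        Page* claim_next(uint32_t worker_node) {
          if (Page* p = pop(worker_node)) {          // (1) local pages first
            return p;
          }
          for (uint32_t node = 0; node < per_node.size(); node++) {
            if (node != worker_node) {
              if (Page* p = pop(node)) {             // (2) then help other nodes
                return p;
              }
            }
          }
          return nullptr;                            // relocation set is empty
        }

      private:
        Page* pop(uint32_t node) {
          if (node >= per_node.size() || per_node[node].empty()) {
            return nullptr;
          }
          Page* p = per_node[node].front();
          per_node[node].pop();
          return p;
        }
      };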

      Some key observations to consider with the above approach:

      * The NUMA node associated with the GC thread should be polled/checked at regular intervals, to account for the fact that the GC thread might have migrated to another CPU, and in turn to another NUMA node. It is probably enough to check the associated NUMA node before claiming a new page and starting to relocate objects (a minimal sketch of such a check follows this list).

      * By choosing pages based on NUMA-node association rather than live bytes, we might not start with the sparsest page first. This is really only a problem if the machine is fully saturated and there are allocation stalls. Additionally, it is worth considering that in a common NUMA configuration, accessing remote memory takes about twice as long as accessing local memory. This means that a local page could (theoretically) be relocated twice as fast as a remote page, which could release memory faster than starting with the sparsest page, if that page is on a remote node.

      * The new strategy is (partly) an optimization for mutators and might make the GC take a bit longer to complete the relocation phase. The current strategy is to move objects to the NUMA node associated with the GC thread, regardless of which node the object originally came from. This makes the GC fast, with the potential downside that mutators no longer access local memory. However, since ZGC is a concurrent garbage collector, a somewhat longer relocation phase is not a huge issue if the mutators get a speedup in return.
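
      As referenced in the first bullet, a minimal sketch of the per-page node check (Linux + libnuma; sched_getcpu() and numa_node_of_cpu() are real calls, while the worker loop is hypothetical and pairs with the claim_next() order sketched above):

      // Re-check the worker's NUMA node before claiming each page
      // (Linux + libnuma, link with -lnuma).

      #include <numa.h>    // numa_available(), numa_node_of_cpu()
      #include <sched.h>   // sched_getcpu() (GNU extension)

      // NUMA node of the CPU the calling thread is running on right now,
      // or 0 if that cannot be determined.
      int current_numa_node() {
        if (numa_available() < 0) {
          return 0;                      // no NUMA support: treat as one node
        }
        const int cpu  = sched_getcpu(); // may differ between calls if the thread migrates
        const int node = cpu >= 0 ? numa_node_of_cpu(cpu) : -1;
        return node >= 0 ? node : 0;
      }

      // Hypothetical worker loop: the node is re-queried per claimed page rather
      // than cached once for the whole relocation phase.
      //
      //   while (Page* page = relocation_set.claim_next(current_numa_node())) {
      //     relocate_objects(page);
      //   }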

            Assignee: jsikstro Joel Sikström
            Reporter: jsikstro Joel Sikström
            Votes: 0
            Watchers: 3
