Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8337709

Use allocated states for chunking large array processing

    XMLWordPrintable

Details

    • Enhancement
    • Resolution: Fixed
    • P4
    • 24
    • None
    • hotspot
    • None
    • gc
    • b11

    Description

      We currently have several mechanisms used by different GCs (or even within the same GC) for breaking up the processing of large arrays into chunks that can be dealt with in parallel. This feature is used in conjunction with task stealing to allow several worker threads to each work on a different chunk of a large array. This can help balance work, especially if there are very large arrays for which processing might happen to only start late in a phase.

      G1 young/mixed collections and ParallelGC young collections both use PartialArrayScanTasks. These encode the remaining work in the length field of either the from-space array (ParallelGC) or the to-space array (G1). But this doesn't work for pinned array objects, so G1 can't use it for arrays that have failed evacuation. This approach also can't be used by concurrent GCs.

      Shenandoah encodes the work state in the high address bits of the array oop in the task. That places limits on the available address space that might not even be valid now (see Linux Large Virtual address space, for example).

      G1 concurrent marking and ZGC use segregated task queues for array chunks. This allows normal queues and array chunk queues to have different element sizes, allowing the array chunk queues to carry additional information. However, this complicates queue processing, work stealing, and termination detection.

      JDK-8332455 proposed using the Shenandoah encoding more widely, with a fallback to fatter tasks (with unused fields in the common oop/narrowOop tasks) if the address space limitation was a problem (which is always, for 32bit platforms). That proposal received some pushback, and ended up being abandoned because the author found they didn't need this to solve a different problem they were working on.

      During the review of JDK-8332455, a different approach was proposed. The proposal is to allocate a state object, whose address is the same size as other values in the task queue (so avoids either segregated queues or larger queue elements), which carries the needed task information. Arena+free-list allocation is used to reduce the cost of such an approach.

      This approach avoids the use-case limits of PartialArrayScanTasks, avoids the additional complexities of segregated queues, and avoids the address space limitation of the Shenandoah encoding. However, it has costs for allocation and management of state objects. The impact of those costs needs to be minimized and the result compared to existing mechanisms.

      Attachments

        Issue Links

          Activity

            People

              kbarrett Kim Barrett
              kbarrett Kim Barrett
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: