Type: Enhancement
Resolution: Fixed
Priority: P2
Fix Version/s: 18
Affects Version/s: hs24, hs25
Component/s: hotspot
Labels:
- gc-g1
- gc-g1-remset

Subcomponent: gc
Resolved In Build: b03

The current G1 remembered set implementation was designed for the use cases, Java heap sizes and applications of 20 years ago.

Over time many problems with performance and in particular memory usage have been observed:

* adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that applications can spend thousands of seconds waiting to acquire these locks. The affected threads are in most cases refinement threads, so this does not directly affect the application, but it can still affect the ability of G1 to meet goals needed for keeping pause times (i.e. the amount of cards from the refinement buffers to be merged into the card table and then scanned during GC).

* there is substantial memory overhead for managing the data structures, for example:
    * separate (hash) tables are used for the three different types of card containers
    * a significant amount of memory is preallocated unnecessarily for some of the card set containers
    * containers store redundant information

* inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some use the C heap with an internal global memory pool. In practice this makes it very difficult to implement anything other than giving back memory in the collection pause, and the corresponding "Free Collection Set" pause can take a significant amount of time because of that.
Memory reuse and arena preallocation are also limited (or would have to be reimplemented multiple times), which stresses the C heap allocator.

* inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving the performance of remembered set management. Mostly because of redundant information everywhere and completely different handling of various aspects in the containers, it is in practice impossible to implement these.

* (partial) inability to give back memory to the OS: while some of the containers use the C heap allocator, and so in some way give back memory, the implementation and handling differ for every container.

* the granularity of the existing containers is unbalanced: currently there are three tiers, "sparse", "fine" and "full". "Sparse" is an array of cards holding perhaps a few hundred entries, "fine" is a bitmap covering a whole region, and "full" is a bit indicating that the region should be scanned completely during GC.

The problems are that there is nothing between "no card at all" and "sparse", and in particular that the difference in capacity between "sparse" and "fine" is very large. I.e. the difference in memory usage when moving from a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to a "fine" bitmap (able to hold 65k entries using 8kB) is significant.
For this reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage, with extremely bad consequences for pause times.
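
The size gap between "sparse" and "fine" can be checked with a little arithmetic. The following is only a sketch: it assumes 512-byte cards and 2-byte card indices in a "sparse" entry (both assumptions chosen because they reproduce the figures quoted above), and all constant and function names are hypothetical rather than HotSpot identifiers.

```cpp
#include <cstddef>

// Illustrative constants only; names are hypothetical, not HotSpot identifiers.
constexpr std::size_t kRegionBytes      = 32u * 1024 * 1024; // 32M region (from the text)
constexpr std::size_t kCardBytes        = 512;               // assumption: card size
constexpr std::size_t kSparseEntries    = 128;               // from the text
constexpr std::size_t kSparseEntryBytes = 2;                 // assumption: 16-bit card index

// Number of cards a single region is divided into.
constexpr std::size_t cards_per_region() { return kRegionBytes / kCardBytes; }

// Payload of a "sparse" array: one small index per remembered card.
constexpr std::size_t sparse_bytes() { return kSparseEntries * kSparseEntryBytes; }

// Payload of a "fine" bitmap: one bit per card in the region.
constexpr std::size_t fine_bytes() { return cards_per_region() / 8; }

// How much more memory the step from "sparse" to "fine" costs.
constexpr std::size_t growth_factor() { return fine_bytes() / sparse_bytes(); }

static_assert(cards_per_region() == 65536, "65k cards per 32M region");
static_assert(sparse_bytes() == 256, "sparse array payload");
static_assert(fine_bytes() == 8192, "fine bitmap payload");
```

Under these assumptions, adding one card beyond the capacity of a "sparse" array raises the payload from 256 bytes to 8192 bytes, a factor of 32.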

Over time some of these issues have been fixed or, in many cases, band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108).

This change is effectively a rewrite of the Java heap card based part of a region's remembered set.

This initial fully working change can be roughly described with the following properties:

* use a single ConcurrentHashTable (CHT) for the card containers of a given region. The container in use is replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259.

* memory for all containers (and the CHT nodes) of a given region's remembered set is backed by arena style bump-pointer allocation buffers, kept per container type and per remembered set. With this change, memory is given back only to free lists during the pause; the implementation gives back memory to the OS concurrently to the application. Memory is still managed using the C heap memory manager, but this is abstracted away and could be replaced by manual page memory management.

* there are now four different container types and one meta-container type. The four actual containers and the meta-container are:
  * inline pointer: stores a few (3-5) cards directly in the CHT node and uses no extra memory.
  * array of cards: similar to the "sparse" container, an array of cards with a configurable number of entries. However, bulk allocation of memory is now managed at a lower level so there is much less waste.
  * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory.
  * full: same as the old "full", indicating for a (sub-)range of memory that all cards are to be looked at during the scan. Similar to inline pointers, this uses no extra memory.
  * howl (the meta-container): the Howl container subdivides a given memory range into subranges, each of which may be described by any of the other containers. This is somewhat similar to the idea suggested in JDK-8048504.

* care has been taken to minimize container memory usage, e.g. by not storing redundant information in the containers and in general by specifying them carefully. They have been designed with future enhancements in mind.
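
To illustrate the idea of an inline container that is coarsened lock-free, here is a minimal sketch. It is not the HotSpot code: the class name, bit layout and limits are all hypothetical. It assumes a 64-bit slot with a 2-bit tag, a 2-bit count and up to three 16-bit card indices, and for brevity it coarsens straight from "inline" to "full", skipping the array, bitmap and Howl stages described above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical slot layout (an assumption for this sketch, not HotSpot's):
//   bits 0-1 : tag (0 = inline cards, 3 = full)
//   bits 2-3 : number of inline cards (0..3)
//   bits 4.. : up to three 16-bit card indices
constexpr std::uint64_t kFull      = 0x3; // "all cards" marker, no payload
constexpr unsigned      kMaxInline = 3;

class ContainerSlot {
  std::atomic<std::uint64_t> _value{0}; // starts as an empty inline container

  static unsigned count(std::uint64_t v) {
    return static_cast<unsigned>((v >> 2) & 0x3);
  }
  static std::uint16_t card_at(std::uint64_t v, unsigned i) {
    return static_cast<std::uint16_t>((v >> (4 + 16 * i)) & 0xFFFF);
  }
  static bool holds(std::uint64_t v, std::uint16_t card) {
    if (v == kFull) return true; // "full" covers every card
    for (unsigned i = 0; i < count(v); i++) {
      if (card_at(v, i) == card) return true;
    }
    return false;
  }

public:
  bool is_full() const { return _value.load(std::memory_order_acquire) == kFull; }
  unsigned inline_count() const { return count(_value.load(std::memory_order_acquire)); }
  bool contains(std::uint16_t card) const {
    return holds(_value.load(std::memory_order_acquire), card);
  }

  // Returns true if this call changed the slot, false if the card was
  // already covered. No lock is taken: every update is a single CAS.
  bool add(std::uint16_t card) {
    std::uint64_t old_v = _value.load(std::memory_order_acquire);
    for (;;) {
      if (holds(old_v, card)) return false;
      unsigned n = count(old_v);
      std::uint64_t new_v;
      if (n < kMaxInline) {
        // Still room: bump the count and store the card in the next free field.
        new_v = (old_v & ~(std::uint64_t{0x3} << 2))
              | (static_cast<std::uint64_t>(n + 1) << 2)
              | (static_cast<std::uint64_t>(card) << (4 + 16 * n));
      } else {
        // Out of room: replace (coarsen) the container in place.
        new_v = kFull;
      }
      if (_value.compare_exchange_weak(old_v, new_v,
                                       std::memory_order_acq_rel,
                                       std::memory_order_acquire)) {
        return true;
      }
      // CAS failed: old_v now holds the value another thread installed; retry.
    }
  }
};
```

In this sketch a thread that loses the race simply re-examines the new slot value, so a concurrent coarsening by another thread is observed on the next loop iteration rather than being overwritten.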

In some benchmarks (where there is significant remembered set memory usage) we are seeing memory usage reduced to 25% of JDK 16 levels with this change. Garbage collection times are the same as or shorter than before; most changes affecting that have been extracted earlier. Individual affected phases are generally shorter.

blocks

JDK-8267830 Investigate G1CardSet ConcurrentHashTable sizing heuristics

Open

JDK-8267831 Improve G1CardSetAllocator sizing heuristics

Open

JDK-8267833 Improve G1CardSetInlinePtr::add()

Resolved

JDK-8267834 Refactor G1CardSetAllocator and BufferNode::Allocator to use a common base class

Resolved

csr for

JDK-8266721 G1: Refactor remembered sets

Closed

duplicates

JDK-8034873 Concurrent collection set freeing

Closed

JDK-8224840 Optimize G1CardTable::mark_region_table()

Closed

JDK-8227665 Clearing collection set candidates takes a significant amount of time

Closed

JDK-8233012 Improve G1 ergonomics for G1RSetRegionEntries(Base)

Closed

relates to

JDK-8266637 CHT: Add insert_and_get method

Resolved

JDK-8077144 Concurrent mark initialization takes too long

Closed

JDK-8269120 Build failure with GCC 6.3.0 after JDK-8017163

Closed

JDK-8280088 NMT: Make mtGCCardSet the subcategory of mtGC

Open

JDK-8048504 G1: Investigate replacing the coarse and fine grained data structures in the remembered sets

Resolved

JDK-8151386 Extract card live data out of G1ConcurrentMark

Resolved

JDK-8145672 Remove dependency of G1FromCardCache to HeapRegionRemSet

Resolved

JDK-8145673 G1RemSetSummary.hpp uses FREE_C_HEAP_ARRAY

Resolved

JDK-8145674 Fix includes and forward declarations in g1Remset files

Resolved

JDK-8145774 Move scrubbing setup code away out of ConcurrentMark

Resolved

JDK-8213108 Improve work distribution during remembered set scan

Resolved

JDK-8213996 Remove one of the SparsePRT entry tables

Resolved

JDK-8213997 Remove G1HRRSUseSparseTable flag

Resolved

JDK-8233919 Incrementally calculate the occupied cards in a heap region remembered set

Resolved

JDK-8266821 G1: Prefetch cards during merge heap roots phase

Resolved

JDK-8273144 Remove unused top level "Sample Collection Set Candidates" logging

Resolved

JDK-8274430 Remove some debug error printing code added in JDK-8017163

Resolved

JDK-8287024 G1: Improve the API boundary between HeapRegionRemSet and G1CardSet

Resolved

JDK-8345397 Remove <cstdio> from g1HeapRegionRemSet.cpp

Resolved

JDK-8151846 Record the number of live cards per region while creating live data

Closed

JDK-8134048 Clear remembered set while shrinking the heap

Closed

JDK-8273941 G1 GC tuning guide updates for JDK18

Resolved

JDK-8242032 G1 region remembered sets may contain non-coarse level PRTs for already coarsened regions

Resolved

JDK-8276540 Howl Full CardSet container iteration marks too many cards

Closed

JDK-8048075 Adding JFR events to track G1 Remembered set size

Open

JDK-6949259 G1: Merge sparse and fine remembered set hash tables

Resolved

JDK-8145667 Move FromCardCache into separate files

Resolved

JDK-8145671 Rename FromCardCache to G1FromCardCache

Resolved

JDK-8153503 Move remset scan iteration claim to remset local data structure

Resolved

JDK-8153507 Improve Card Table Clear Task

Resolved

JDK-8162928 Micro-optimizations in scanning the remembered sets

Resolved

JDK-8180415 Rebuild remembered sets during the concurrent cycle

Resolved

JDK-8262185 G1: Prune collection set candidates early

Resolved

JDK-8269134 Remove sparsePRT.inline.hpp after JDK-8017163

Resolved

JDK-8275056 Virtualize G1CardSet containers over heap region

Resolved

JDK-8273186 Remove leftover comment about sparse remembered set in G1 HeapRegionRemSet

Resolved

JDK-8016505 G1: Revert back to use HeapBaseMinAddress=256m on Solaris x86

Closed

JDK-8229049 JEP 363: Remove the Concurrent Mark Sweep (CMS) Garbage Collector

Closed

JDK-7187490 G1: Limit the amount of remembered set scrubbing

Closed

JDK-8043574 Investigate decreasing the RS scrubbing work in the GC cleanup pause

Closed

JDK-8058803 Allow one remembered set to be used for multiple regions

Closed

JDK-8153505 Split up G1RemSet::oops_into_collection_set_do into parts

Closed

links to

Commit openjdk/jdk/1692fd2e

Review openjdk/jdk/4116

Release Note: Obsoleted Product Options -XX:G1RSetRegionEntries and -XX:G1RSetSparseRegionEntries

Closed

Thomas Schatzl

Assignee: Thomas Schatzl

Reporter: Bengt Rutisson (Inactive)

Votes: 0

Watchers: 9

Created: 2013-06-20 04:39

Updated: 2024-12-03 06:30

Resolved: 2021-06-21 03:07