JDK-8030849

Investigate high fragmentation/waste in some situations during allocation during GC in G1

    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: P3
    • Fix Version/s: tbd
    • Affects Version/s: 9
    • Component/s: hotspot
    • Subcomponent: gc

      G1 sometimes cannot prevent full GCs even with timely marking and mixed GCs. These full GCs occur intermittently with no immediately apparent cause, and they are extremely bad for response times, as they (and the mixed GCs with to-space exhaustion) are very slow.

      Leading up to these full GCs there are typically many mixed GCs with their respective marking cycles, but these mixed GCs do not manage to reclaim anything; the heap actually grows.

      One reproducer is the CRM Fuse benchmark, where we deliberately increased the heap region size (to 4M) so that the very frequent large objects do not become humongous objects, which G1 currently does not handle well at all.
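
      For reference, that setup corresponds to a command line along the following lines (a sketch only; the actual benchmark invocation and heap sizes are not recorded in this issue):

          java -XX:+UseG1GC -XX:G1HeapRegionSize=4m -Xmx<size> <CRM Fuse benchmark main class>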

      Some initial investigation with -XX:+ParallelGCVerbose indicates that the waste due to PLABs is already very high in the regular case (in total 8M of waste for about 35-40M copied into the survivor or old gen, i.e. roughly 20% overhead), and extremely high (>24M) in those cases where the mixed GCs grow the heap.

      The current suspicion is that the allocators used during evacuation are not able to avoid this waste. G1's PLAB resizing likely contributes to it (resized PLABs can basically become additional large objects), but the main factor is probably the frequent large objects (up to 2M) in the young gen, which cause G1 to retire its current allocation regions early.
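
      As an illustration of the suspected allocation pattern only (this is not the attached gcmicro benchmark, whose contents are not reproduced here), a minimal Java sketch that interleaves many small allocations with frequent largish arrays just below 2M could look like this:

          import java.util.ArrayList;
          import java.util.List;
          import java.util.concurrent.ThreadLocalRandom;

          // Minimal sketch: interleave many small allocations with frequent "largish"
          // byte[] allocations that stay just below the humongous threshold of a 4M
          // region (half the region size, i.e. 2M). Such objects are suspected to force
          // G1 to retire its current GC allocation regions early, wasting region tails.
          // Run e.g. with: java -XX:+UseG1GC -XX:G1HeapRegionSize=4m GcMixedAllocations
          public class GcMixedAllocations {
              public static void main(String[] args) {
                  List<Object> retained = new ArrayList<>();
                  ThreadLocalRandom rnd = ThreadLocalRandom.current();
                  for (long i = 0; ; i++) {
                      retained.add(new byte[64]);   // many small objects
                      if (i % 100 == 0) {
                          // occasional largish object; stay safely below 2M including the array header
                          retained.add(new byte[rnd.nextInt(512 * 1024, 2 * 1024 * 1024 - 1024)]);
                      }
                      if (retained.size() > 100_000) {
                          // drop older objects so the heap does not simply fill up,
                          // while keeping enough survivors to exercise evacuation
                          retained.subList(0, 50_000).clear();
                      }
                  }
              }
          }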

      As a first step, this situation must be investigated further: the assumptions above need to be verified, and object size and fragmentation/waste information gathered; based on that, ideas to mitigate the issue can be developed. Possible solutions include:
      - limiting the maximum PLAB size to decrease waste due to overly large PLABs, if that turns out to be a problem (see the example flags after this list)
      - if the problem is waste at the end of a region caused by largish objects coming in: improve the allocator to not retire the region, but keep its tail available for further PLAB allocation, and allocate the large object somewhere else
      - analyzing the current code that attempts to mitigate this issue (two allocation LABs per thread), possibly extending it
      - something else
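
      Regarding the first option above, existing flags already allow experimenting with fixed or smaller PLAB sizes, for example (values are illustrative only; note that the PLAB size flags are specified in heap words, not bytes):

          java -XX:+UseG1GC -XX:-ResizePLAB -XX:YoungPLABSize=2048 -XX:OldPLABSize=512 <benchmark>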

      This is the umbrella issue; follow-up work will be tracked in separate issues.

        Attachments:
        1. z2.log.bz2 (2.50 MB)
        2. gcmicro.tar.gz (31 kB)

            Assignee: Unassigned
            Reporter: Thomas Schatzl (tschatzl)
