  2. JDK-8313354

[GenShen] Highly volatile GC cycles, including degenerated GC's, with SPECjbb


    • gc

      I have recently been using SPECjbb2015 (albeit with non-compliant settings) to measure the comparative effect of a performance enhancement.

      I changed the SPECjbb props to run the benchmark for a fixed period at a fixed injection rate (IR), so as to rule out any extraneous artifacts introduced by the harness adapting to observed performance. The idea was to look at the number of GCs and any changes in the metrics associated with these GCs, specifically for Generational Shenandoah.

      The change to config/specjbb.props is listed below:

      279d278
      < specjbb.controller.type=PRESET
      287d285
      < specjbb.controller.preset.ir=2000
      295,296d292
      < # 20 minute run
      < specjbb.controller.preset.duration=1200000
      307,309d302
      < # exactly 2 minutes each
      < specjbb.controller.rtcurve.duration.min=120000
      < specjbb.controller.rtcurve.duration.max=120000
      318,320d310
      < #specjbb.controller.settle.time.min=3000
      < #specjbb.controller.settle.time.max=30000
      < #

      I noticed that while non-generational Shenandoah shows a rock-steady number of GC cycles and no degenerated cycles, Generational Shenandoah's cycle counts vary quite widely from run to run, and include a large number of degenerated cycles.

      On the face of it, and without closer examination, this suggests that GenShen's cycles trigger a tad too frequently and perhaps end up degenerating as a result. There may also be a positive feedback loop in the triggering criteria that produces the large and variable number of GC cycles. This is based on a superficial examination of the logs; the behavior needs to be understood and any pathologies addressed.

      The following numbers were obtained on a 40-core machine running SPECjbb with PRESET at 2K jOPS, which would typically be considered a moderate load. Note that this setting is quite non-standard and also non-compliant. The numbers should be taken with the appropriate dose of salt, as a way to understand the apparent triggering instability and not as an indicator of general behavior.

      For non-generational Shenandoah, the numbers were in the following ballpark (cycles were 43 or 44; with 1 or 2 abbreviated GCs, 0 degenerated or full):
      results_reference_default_3/composite.out:[1337.413s][info ][gc,stats ] 44 Successful Concurrent GCs
      results_reference_default_3/composite.out:[1337.413s][info ][gc,stats ] 0 Completed Old GCs
      results_reference_default_3/composite.out:[1337.413s][info ][gc,stats ] 0 Degenerated GCs
      results_reference_default_3/composite.out:[1337.413s][info ][gc,stats ] 2 Abbreviated GCs
      results_reference_default_3/composite.out:[1337.413s][info ][gc,stats ] 0 Full GCs

      For generational Shenandoah, the numbers were quite different. The best case of 5 runs gave:
      2k_3_genshen/results_reference_default_5/composite.out:[1338.709s][info ][gc,stats ] 93 Successful Concurrent GCs
      2k_3_genshen/results_reference_default_5/composite.out:[1338.709s][info ][gc,stats ] 2 Completed Old GCs
      2k_3_genshen/results_reference_default_5/composite.out:[1338.709s][info ][gc,stats ] 11 Degenerated GCs
      2k_3_genshen/results_reference_default_5/composite.out:[1338.709s][info ][gc,stats ] 2 Abbreviated GCs
      2k_3_genshen/results_reference_default_5/composite.out:[1338.709s][info ][gc,stats ] 4 Full GC

      The worst case of those 5 runs gave:
      2k_3_genshen/results_reference_default_3/composite.out:[1315.850s][info ][gc,stats ] 1554 Successful Concurrent GCs
      2k_3_genshen/results_reference_default_3/composite.out:[1315.850s][info ][gc,stats ] 16 Completed Old GCs
      2k_3_genshen/results_reference_default_3/composite.out:[1315.850s][info ][gc,stats ] 59 Degenerated GCs
      2k_3_genshen/results_reference_default_3/composite.out:[1315.850s][info ][gc,stats ] 1089 Abbreviated GCs
      2k_3_genshen/results_reference_default_3/composite.out:[1315.850s][info ][gc,stats ] 6 Full GC
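
To compare runs without eyeballing the grep output, the `gc,stats` lines above can be tabulated per run. The following is a minimal sketch, assuming the lines have exactly the grep-style shape shown in this report; the names `STATS_LINE`, `summarize`, and `spread` are hypothetical helpers, not part of any JDK or SPECjbb tooling.

```python
import re
from collections import defaultdict

# Matches grep-style lines of the shape shown in this report, e.g.
#   some_dir/composite.out:[1337.413s][info ][gc,stats ] 44 Successful Concurrent GCs
STATS_LINE = re.compile(
    r"^(?P<path>[^:]+):\[(?P<secs>[\d.]+)s\]\[info\s*\]\[gc,stats\s*\]\s+"
    r"(?P<count>\d+)\s+(?P<label>.+?)\s*$"
)


def normalize(label):
    # The excerpts contain both "Full GC" and "Full GCs"; fold them together.
    return label if label.endswith("s") else label + "s"


def summarize(lines):
    """Return {path: {label: count}} for every line that matches STATS_LINE."""
    runs = defaultdict(dict)
    for line in lines:
        m = STATS_LINE.match(line.strip())
        if m:
            runs[m["path"]][normalize(m["label"])] = int(m["count"])
    return dict(runs)


def spread(runs, label):
    """Return (min, max) of one counter across all runs."""
    values = [counts.get(label, 0) for counts in runs.values()]
    return min(values), max(values)


# A few lines copied from the best and worst generational runs above.
SAMPLE = [
    "2k_3_genshen/results_reference_default_5/composite.out:[1338.709s][info ][gc,stats ] 11 Degenerated GCs",
    "2k_3_genshen/results_reference_default_5/composite.out:[1338.709s][info ][gc,stats ] 4 Full GC",
    "2k_3_genshen/results_reference_default_3/composite.out:[1315.850s][info ][gc,stats ] 59 Degenerated GCs",
    "2k_3_genshen/results_reference_default_3/composite.out:[1315.850s][info ][gc,stats ] 6 Full GC",
]

if __name__ == "__main__":
    runs = summarize(SAMPLE)
    for label in ("Degenerated GCs", "Full GCs"):
        print(label, spread(runs, label))
```

Feeding it the full grep output for all 5 runs would give the min/max of each counter, which is the run-to-run variation this report is concerned with.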

      Note from the timestamps that the total runtimes were not very different, and were not longer for the runs with more cycles. Indeed, the run with the large number of degenerated cycles also had a lot of abbreviated cycles and finished sooner.

      Still, it makes sense to understand this phenomenon (at least from the standpoint of energy frugality) and to see whether the triggering can be throttled appropriately and made a bit more deterministic.
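
Since the runtimes are close, the cycle counts translate almost directly into cycle rates. A rough sketch of that arithmetic follows, using only the "Successful Concurrent GCs" counts and the final log timestamps quoted above. It assumes the timestamp approximates the measured interval, which it does not exactly, since it covers the whole JVM lifetime; `cycles_per_minute` is a hypothetical helper.

```python
def cycles_per_minute(cycles, elapsed_seconds):
    """Average number of GC cycles per minute over the elapsed time."""
    return 60.0 * cycles / elapsed_seconds


# (Successful Concurrent GCs, final log timestamp in seconds) from this report.
RUNS = {
    "non-generational": (44, 1337.413),
    "generational, best of 5": (93, 1338.709),
    "generational, worst of 5": (1554, 1315.850),
}

if __name__ == "__main__":
    for name, (cycles, secs) in RUNS.items():
        print(f"{name}: {cycles_per_minute(cycles, secs):.1f} cycles/min")
```

On these figures the non-generational run averages about 2 cycles per minute, the best generational run about 4, and the worst about 71.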

            ysr Y. Ramakrishna