JDK-8162928

Micro-optimizations in scanning the remembered sets



    • Type: Enhancement
    • Status: Resolved
    • Priority: P4
    • Resolution: Fixed
    • Affects Version: 9
    • Fix Version: 10
    • Component: hotspot
    • Subcomponent: gc
    • Resolved In Build: b21


      During recent work the following worthwhile micro-optimizations for scanning remembered sets (or in general, cards) have been found:

      - Using HeapRegion::oops_on_card_seq_iterate_careful() is faster than using HeapRegionDCTOC during the scan RS phase.

      - HeapRegion::oops_on_card_seq_iterate_careful() can be sped up by specializing it for its two use cases: during gc vs. during mutator time.

      E.g. a lot of extra checks can go away for such a specialization, like the filter_young one, the g1h->is_gc_active(), the card_ptr != NULL, the various checks whether we are scanning into an unparseable point etc.
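
A minimal sketch of the specialization idea, assuming a compile-time flag selects the gc-time or mutator-time variant. ToyRegion, kCleanCard, scan_card and the runtime_checks counter are hypothetical names for illustration; the checks only loosely mirror the ones named above and are not the actual HotSpot code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration only; not HotSpot types.
struct ToyRegion {
  bool is_young;
};

const unsigned char kCleanCard = 0xff;

// during_gc is a template parameter, so each instantiation sees a
// compile-time constant and the compiler can drop the untaken branch.
template <bool during_gc>
bool scan_card(const ToyRegion& region, unsigned char* card_ptr,
               int* runtime_checks) {
  if (!during_gc) {
    // Mutator-time caller: the defensive checks stay as run-time branches.
    ++*runtime_checks;
    if (region.is_young) return false;  // toy filter_young-style check
    ++*runtime_checks;
    if (card_ptr != nullptr) *card_ptr = kCleanCard;  // toy card_ptr check
  } else {
    // GC-time caller: the condition is guaranteed by the caller, so it
    // becomes a debug-only assertion instead of a run-time branch.
    assert(!region.is_young);
  }
  // ... the actual object iteration would go here ...
  return true;
}
```

Callers would pick the variant statically, e.g. scan_card<true>(...) from the gc path and scan_card<false>(...) from the refinement path, so neither pays for the other's checks.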

      - HeapRegion::oops_on_card_seq_iterate_careful() always does at least one unnecessary call to HeapRegion::block_size().
      I.e. the result of the call made while positioning the cursor at the object starting at or spanning into the card in question is not reused on entry to the iteration loop.

      HeapRegion::block_size() is very expensive in G1.
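
The redundant call can be illustrated with a toy model. This is a sketch under assumptions: ToyRegion, scan_naive and scan_reusing are hypothetical, objects tile the region without gaps, and card_start < card_end <= region size. It only shows how the size computed during cursor positioning can be carried into the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy region: size_at[i] is the size (in words) of the object
// starting at word i, and 0 for words that are not object starts.
struct ToyRegion {
  std::vector<std::size_t> size_at;
  mutable std::size_t block_size_calls;

  explicit ToyRegion(std::vector<std::size_t> sizes)
      : size_at(std::move(sizes)), block_size_calls(0) {}

  // Stand-in for an expensive size lookup; counts how often it is called.
  std::size_t block_size(std::size_t addr) const {
    ++block_size_calls;
    return size_at[addr];
  }
};

// Positions the cursor, then recomputes the first object's size in the loop.
std::size_t scan_naive(const ToyRegion& r, std::size_t card_start,
                       std::size_t card_end) {
  std::size_t cur = 0;
  for (;;) {
    const std::size_t sz = r.block_size(cur);
    if (cur + sz > card_start) break;  // object starts at or spans into card
    cur += sz;
  }
  std::size_t visited = 0;
  while (cur < card_end) {
    const std::size_t sz = r.block_size(cur);  // repeated for first object
    ++visited;
    cur += sz;
  }
  return visited;
}

// Same traversal, but the size found while positioning is reused.
std::size_t scan_reusing(const ToyRegion& r, std::size_t card_start,
                         std::size_t card_end) {
  std::size_t cur = 0;
  std::size_t sz = r.block_size(cur);
  while (cur + sz <= card_start) {
    cur += sz;
    sz = r.block_size(cur);
  }
  std::size_t visited = 0;
  for (;;) {
    ++visited;
    cur += sz;
    if (cur >= card_end) break;
    sz = r.block_size(cur);
  }
  return visited;
}
```

In this model both functions visit the same objects; the reusing variant makes exactly one block_size() call fewer per scanned card.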

        - one can aggressively specialize HeapRegion::block_size() for the use case during gc:
          - addr cannot be >= top(), so that check can be dropped
          - the repeated calculation of g1h->concurrent_mark()->prevMarkBitMap() is very expensive. Its load should be hoisted out of the oops_on_card_seq_iterate_careful() main loop and passed in from a local variable.
          - further, block_size() (or a specialized variant of it) should also return whether the object is dead. Currently, after determining block_size(), oops_on_card_iterate() does another expensive lookup of the prev mark bitmap just to check whether the object is dead.
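
The three points above can be sketched together as follows. All names (ToyHeap, BlockInfo, block_info_during_gc, count_live_objects) are hypothetical, and "dead" is simplified to "not marked in the bitmap"; the real liveness rules in G1 are more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy heap: size_at[i] is the size of the object starting at
// word i; prev_marked[i] says whether that object is marked.
struct ToyHeap {
  std::vector<std::size_t> size_at;
  std::vector<bool> prev_marked;
  mutable std::size_t bitmap_loads;

  // Stand-in for an expensive accessor chain; counts how often it runs.
  const std::vector<bool>& prev_mark_bitmap() const {
    ++bitmap_loads;
    return prev_marked;
  }
};

// Size and liveness are returned together, so the caller does not need a
// second bitmap lookup to learn whether the object is dead.
struct BlockInfo {
  std::size_t size;
  bool is_dead;
};

// GC-time variant: the caller guarantees addr is below the end of the
// region, so the bound is only asserted. The bitmap is passed in.
BlockInfo block_info_during_gc(const ToyHeap& heap,
                               const std::vector<bool>& bitmap,
                               std::size_t addr) {
  assert(addr < heap.size_at.size());
  BlockInfo info;
  info.size = heap.size_at[addr];
  info.is_dead = !bitmap[addr];  // simplification: unmarked means dead
  return info;
}

// start must be an object start; objects are assumed to tile [start, end).
std::size_t count_live_objects(const ToyHeap& heap, std::size_t start,
                               std::size_t end) {
  // Hoisted: the bitmap is loaded once, outside the main loop.
  const std::vector<bool>& bitmap = heap.prev_mark_bitmap();
  std::size_t live = 0;
  std::size_t cur = start;
  while (cur < end) {
    const BlockInfo info = block_info_during_gc(heap, bitmap, cur);
    if (!info.is_dead) ++live;
    cur += info.size;
  }
  return live;
}
```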

      - the called methods need to be reviewed to see whether they can be made more amenable to inlining (some short callees are defined in .cpp files)

      - HeapRegion::block_is_obj() could be aggressively specialized for RS scan too: the first check, for whether the given address is in a continues humongous region, can be hoisted out of the entire oop iteration loop into oops_on_card_seq_iterate_careful().

      - HeapRegion::is_obj_dead() could be specialized too: e.g. the is_archive check can be hoisted out to top-level (and actually, since archive regions do not contain any references to non-archive regions, it is superfluous).
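
Hoisting a per-region check out of the per-object loop, as suggested in the last two items, could look roughly like this. The sketch assumes a toy ToyRegion in which a continues humongous region contains no object starts; the function names and the region_checks counter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy region; size_at[i] is the size of the object starting at
// word i. The flag is a per-region property that never changes mid-scan.
struct ToyRegion {
  bool continues_humongous;
  std::vector<std::size_t> size_at;
};

// Before: the region-level property is re-tested for every object.
std::size_t count_objects_per_object_check(const ToyRegion& r,
                                           std::size_t* region_checks) {
  std::size_t n = 0;
  for (std::size_t cur = 0; cur < r.size_at.size(); cur += r.size_at[cur]) {
    ++*region_checks;
    if (r.continues_humongous) break;
    ++n;
  }
  return n;
}

// After: the property is tested once, before entering the loop.
std::size_t count_objects_hoisted_check(const ToyRegion& r,
                                        std::size_t* region_checks) {
  ++*region_checks;
  if (r.continues_humongous) return 0;
  std::size_t n = 0;
  for (std::size_t cur = 0; cur < r.size_at.size(); cur += r.size_at[cur]) {
    ++n;
  }
  return n;
}
```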


              Assignee: tschatzl Thomas Schatzl
              Reporter: tschatzl Thomas Schatzl