JDK-8277850

C2: optimize mask checks in counted loops

Details

    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • Fix Version: 18
    • Affects Version: 17, 18
    • Component: hotspot
    • Resolved In Build: b28

      Description

        The memory access API supports custom alignment constraints, which are checked upon memory access, using the following formula:

        ((segmentBaseAddress + accessedOffset) & alignmentMask) == 0
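        As a rough illustration of the formula above, the check can be written in plain Java as follows. This is a minimal sketch, not the API's actual implementation: the class `AlignmentCheck` and its methods are hypothetical, and it assumes the mask for a power-of-two alignment is `alignment - 1` (so a 1-byte alignment gives mask 0).

        ```java
        // Illustrative only: AlignmentCheck, maskFor and isAligned are hypothetical
        // names, not part of the memory access API.
        final class AlignmentCheck {

            // Assumption: for a power-of-two alignment, mask = alignment - 1
            // (4 bytes -> 0b11, 1 byte -> 0).
            static long maskFor(long alignmentBytes) {
                if (alignmentBytes <= 0 || (alignmentBytes & (alignmentBytes - 1)) != 0) {
                    throw new IllegalArgumentException("alignment must be a power of two");
                }
                return alignmentBytes - 1;
            }

            // The check from the formula: the masked low bits of the address must be zero.
            static boolean isAligned(long segmentBaseAddress, long accessedOffset, long alignmentMask) {
                return ((segmentBaseAddress + accessedOffset) & alignmentMask) == 0;
            }

            public static void main(String[] args) {
                long mask = maskFor(4);
                System.out.println(isAligned(0x1000L, 8, mask));       // true
                System.out.println(isAligned(0x1000L, 6, mask));       // false
                System.out.println(isAligned(0x1000L, 6, maskFor(1))); // true: mask 0 never fails
            }
        }
        ```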

        However, when accessing a segment using a var handle obtained from a layout featuring a non-trivial alignment mask, access performance is slower than in the case where the alignment mask is 0.

        The attached patch adds a benchmark which shows the problem; the benchmark compares accessing a segment using a 4-byte aligned vs. a 1-byte aligned layout:

        ```
        Benchmark                                                 Mode  Cnt  Score   Error  Units
        LoopOverNonConstant.segment_loop_instance_index           avgt   30  0.229 ± 0.001  ms/op
        LoopOverNonConstant.segment_loop_instance_index_aligned   avgt   30  0.329 ± 0.005  ms/op
        ```

        As can be seen, access with alignment constraints is slower: 0.329 ms/op versus 0.229 ms/op, roughly 44% more time per operation.

        This is mildly surprising: in the above formula, segmentBaseAddress is loop invariant, whereas accessedOffset typically depends on the loop variable, so the existing BCE (bounds check elimination) logic should kick in and detect that the offset is always aligned (given the loop stride).
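        The reasoning can be sketched in plain Java. This is a hand-written illustration of the expected transformation, not what C2 actually emits; the class `MaskCheckLoop` and its methods are hypothetical, and it assumes the mask has the form 2^k - 1. If the stride is a multiple of the alignment, every offset has zero low bits, so the per-iteration check collapses to a single check on the loop-invariant base.

        ```java
        // Illustrative only: MaskCheckLoop and its methods are hypothetical names.
        final class MaskCheckLoop {

            // As written in source: the mask check runs on every iteration.
            static int checkedEveryIteration(long base, int count, long stride, long mask) {
                int aligned = 0;
                for (int i = 0; i < count; i++) {
                    long offset = i * stride;            // depends on the loop variable
                    if (((base + offset) & mask) == 0) { // base is loop invariant
                        aligned++;
                    }
                }
                return aligned;
            }

            // Hoisted form. Assumes mask == 2^k - 1. When (stride & mask) == 0, every
            // offset is a multiple of the alignment, so the check reduces to base & mask.
            static int checkedOnce(long base, int count, long stride, long mask) {
                if ((stride & mask) != 0) {
                    // Stride does not preserve alignment: keep the per-iteration check.
                    return checkedEveryIteration(base, count, stride, mask);
                }
                return (base & mask) == 0 ? Math.max(count, 0) : 0;
            }

            public static void main(String[] args) {
                // 4-byte stride with a 4-byte alignment mask: both forms agree.
                System.out.println(checkedEveryIteration(0x1000L, 100, 4, 3)); // 100
                System.out.println(checkedOnce(0x1000L, 100, 4, 3));           // 100
            }
        }
        ```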

              People

                Assignee: Roland Westrelin (roland)
                Reporter: Maurizio Cimadamore (mcimadamore)