JDK-8277850

C2: optimize mask checks in counted loops

    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • Fix Version/s: 18
    • Affects Version/s: 17, 18
    • Component/s: hotspot
    • Resolved In Build: b28

        The memory access API supports custom alignment constraints, which are checked upon memory access, using the following formula:

        ((segmentBaseAddress + accessedOffset) & alignmentMask) == 0
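
        The check can be sketched in plain Java as follows. This is only an illustration: the class and method names are hypothetical and are not part of the memory access API, and the sketch assumes that the mask for a power-of-two alignment is `alignment - 1` (so a 1-byte alignment gives a mask of 0).

        ```java
        // Illustrative sketch only; AlignmentCheck, maskFor and isAligned are
        // hypothetical names, not JDK API.
        final class AlignmentCheck {

            // Assumption: alignment is a power of two, so the mask is alignment - 1.
            static long maskFor(long alignmentBytes) {
                if (alignmentBytes <= 0 || (alignmentBytes & (alignmentBytes - 1)) != 0) {
                    throw new IllegalArgumentException("alignment must be a power of two");
                }
                return alignmentBytes - 1;
            }

            // The formula from the description: the low bits selected by the mask
            // must all be zero for the access to be aligned.
            static boolean isAligned(long segmentBaseAddress, long accessedOffset, long alignmentMask) {
                return ((segmentBaseAddress + accessedOffset) & alignmentMask) == 0;
            }

            public static void main(String[] args) {
                long mask4 = maskFor(4); // 4-byte alignment, mask 3
                long mask1 = maskFor(1); // 1-byte alignment, mask 0 (trivial)
                System.out.println(isAligned(0x1000L, 8L, mask4)); // true
                System.out.println(isAligned(0x1000L, 6L, mask4)); // false
                System.out.println(isAligned(0x1000L, 6L, mask1)); // true
            }
        }
        ```

        With a mask of 0 the check always succeeds, which corresponds to the 1-byte aligned case.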

        However, when a segment is accessed through a var handle obtained from a layout with a non-trivial alignment mask, access is slower than when the alignment mask is 0.

        The attached patch adds a benchmark that shows the problem by comparing segment access through a 4-byte aligned layout against access through a 1-byte aligned layout:

        ```
        Benchmark                                                Mode  Cnt  Score   Error  Units
        LoopOverNonConstant.segment_loop_instance_index          avgt   30  0.229 ± 0.001  ms/op
        LoopOverNonConstant.segment_loop_instance_index_aligned  avgt   30  0.329 ± 0.005  ms/op
        ```

        As can be seen, access with alignment constraints is about 44% slower (0.329 vs. 0.229 ms/op).

        This is mildly surprising: in the above formula, segmentBaseAddress is loop invariant, whereas accessedOffset typically depends on the loop variable, so the existing BCE (bounds check elimination) logic should kick in and detect that the offset is always aligned, given the loop stride.
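
        The reasoning can be sketched by hand in plain Java. This is not what C2 emits and not the API's implementation; the class and method names are hypothetical. The sketch assumes the offset is `i * stride` and that the stride is a multiple of the alignment, i.e. `(stride & mask) == 0`. Under that assumption, `(base + i * stride) & mask` equals `base & mask` for every `i`, so a single check before the loop is equivalent to a check on every iteration.

        ```java
        // Illustrative sketch only; HoistedMaskCheck and its methods are hypothetical.
        final class HoistedMaskCheck {

            // Mask check performed on every iteration, as in the formula above.
            static long sumChecked(int[] data, long base, long stride, long mask) {
                long sum = 0;
                for (int i = 0; i < data.length; i++) {
                    long offset = i * stride; // offset depends on the loop variable
                    if (((base + offset) & mask) != 0) {
                        throw new IllegalStateException("misaligned access");
                    }
                    sum += data[i];
                }
                return sum;
            }

            // Same result, with the check hoisted out of the loop when the stride
            // cannot change the masked bits.
            static long sumHoisted(int[] data, long base, long stride, long mask) {
                if ((stride & mask) != 0) {
                    // Stride may disturb the masked bits: keep the per-iteration check.
                    return sumChecked(data, base, stride, mask);
                }
                // Loop-invariant check, only needed if the loop runs at least once.
                if (data.length > 0 && (base & mask) != 0) {
                    throw new IllegalStateException("misaligned access");
                }
                long sum = 0;
                for (int value : data) {
                    sum += value;
                }
                return sum;
            }

            public static void main(String[] args) {
                int[] data = {1, 2, 3};
                // 4-byte stride with a 4-byte alignment mask (3): both variants agree.
                System.out.println(sumChecked(data, 0x1000L, 4L, 3L)); // 6
                System.out.println(sumHoisted(data, 0x1000L, 4L, 3L)); // 6
            }
        }
        ```

        In the benchmark's terms, a 4-byte element stride combined with a 4-byte alignment mask satisfies the assumption, which is why the per-iteration check looks redundant.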

              Assignee: Roland Westrelin (roland)
              Reporter: Maurizio Cimadamore (mcimadamore)