C2 SuperWord: disable automatic alignment for small iteration counts

    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: P4
    • tbd
    • Affects Version/s: 27
    • Component/s: hotspot

      This is another approach to solving JDK-8344085.

      I did some benchmarking recently, see comments:
      https://github.com/openjdk/jdk/pull/22629#issuecomment-3811234712

      I can see that small iteration counts probably do not profit from automatic alignment. There is a trade-off here:

      Alignment means spending more iterations in the scalar pre-loop, and more pre-loop iterations can mean fewer iterations in the main/drain loop. That has a performance penalty, especially noticeable for loops with small iteration counts, where a few single (scalar) iterations make a big contribution to runtime.
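
      To make the iteration accounting concrete, here is a minimal sketch of how a fixed iteration budget is split between the scalar and vector loops. It is a simplified illustration, not C2 code: the class `LoopSplit`, its parameters, and the assumptions that there is no unrolling, no separate drain loop, and a pre-loop of zero iterations when alignment is off are all hypothetical.

```java
// Hypothetical model, not C2 code: splits a loop's iterations into a
// scalar pre-loop, a vectorized main loop, and a scalar post-loop.
// Assumes no unrolling and no separate drain loop.
final class LoopSplit {
    final int pre;   // scalar iterations before the main loop
    final int main;  // vector iterations, each covering vectorLen elements
    final int post;  // scalar iterations left over after the main loop

    private LoopSplit(int pre, int main, int post) {
        this.pre = pre;
        this.main = main;
        this.post = post;
    }

    // misalignedElems: how many elements past a vector boundary the first
    // access starts. With align == true, the pre-loop runs until the next
    // boundary is reached; with align == false, it is skipped in this model.
    static LoopSplit of(int iterations, int misalignedElems, int vectorLen, boolean align) {
        int pre = 0;
        if (align) {
            pre = (vectorLen - misalignedElems % vectorLen) % vectorLen;
            pre = Math.min(pre, iterations);
        }
        int remaining = iterations - pre;
        int main = remaining / vectorLen;
        int post = remaining % vectorLen;
        return new LoopSplit(pre, main, post);
    }

    // Total iterations executed one element at a time.
    int scalarIterations() {
        return pre + post;
    }
}
```

      Under these assumptions, a loop of 20 iterations with 8-element vectors that starts 1 element past a boundary splits into 7 pre, 1 main, and 5 post iterations when aligned (12 scalar iterations), versus 0 pre, 2 main, and 4 post iterations when not aligned (4 scalar iterations).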

      Misalignment means we have split memory accesses in the main/drain loop. That has a penalty that could cut the speedup of vectorization in half, since each vector access is split in two.
      If we only have a few main/drain loop iterations, this is not so noticeable, but if there are many main/drain loop iterations, this really starts to show.

      There must be a cut-off point in the iteration count:
      - below it we should not align: the extra pre-loop iterations are too expensive
      - above it we should align: split accesses in the main loop are too expensive
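
      The cut-off can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not the heuristic proposed for C2: the class `AlignmentCutoff`, the cost parameters, and the factor-of-two split penalty are hypothetical and would have to be replaced by measured values. It also ignores unrolling and the drain loop.

```java
// Hypothetical cost model, not C2 code: compares the estimated cost of a
// loop with and without alignment of the main loop.
final class AlignmentCutoff {
    // Estimated cost of running `iterations` elements.
    // scalarCost:   cost of one scalar iteration (assumed value)
    // vectorCost:   cost of one aligned vector iteration (assumed value)
    // splitPenalty: factor applied to a vector iteration whose memory
    //               access is split, e.g. 2.0 (assumed value)
    static double cost(int iterations, int misalignedElems, int vectorLen, boolean align,
                       double scalarCost, double vectorCost, double splitPenalty) {
        int offset = misalignedElems % vectorLen;
        int pre = 0;
        if (align) {
            // Scalar iterations needed to reach the next vector boundary.
            pre = Math.min((vectorLen - offset) % vectorLen, iterations);
        }
        int remaining = iterations - pre;
        int main = remaining / vectorLen;
        int post = remaining % vectorLen;
        // Vector accesses are split only if we did not align and the
        // starting address is actually off a vector boundary.
        boolean split = !align && offset != 0;
        double perVector = split ? vectorCost * splitPenalty : vectorCost;
        return (pre + post) * scalarCost + main * perVector;
    }

    // True if aligning is estimated to be strictly cheaper than not aligning.
    static boolean shouldAlign(int iterations, int misalignedElems, int vectorLen,
                               double scalarCost, double vectorCost, double splitPenalty) {
        double aligned = cost(iterations, misalignedElems, vectorLen, true,
                              scalarCost, vectorCost, splitPenalty);
        double unaligned = cost(iterations, misalignedElems, vectorLen, false,
                                scalarCost, vectorCost, splitPenalty);
        return aligned < unaligned;
    }
}
```

      With 8-element vectors, a start 1 element past a boundary, unit scalar and vector costs, and a split penalty of 2.0, this model estimates 13 versus 8 cost units for 20 iterations (do not align) and 132 versus 250 for 1000 iterations (align). Where the real cut-off lies depends on the hardware and would need to come from benchmarks.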

            Assignee:
            Emanuel Peter
            Reporter:
            Emanuel Peter
            Votes:
            0
            Watchers:
            1

              Created:
              Updated: