[VectorAPI]: AArch64: Prefer merging mode SVE CPY instruction

    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: P4
    • Fix Version/s: tbd
    • Affects Version/s: 27
    • Component/s: hotspot
    • CPU: aarch64
    • OS: generic

      On Neoverse-V1/V2, the SVE `CPY (immediate, merging)` instruction performs better than the SVE `CPY (immediate, zeroing)` instruction. Replacing `CPY (immediate, zeroing)` with `MOVI + CPY (immediate, merging)` yields a performance uplift of **12%** to **100%** in specific Java Vector API micro-benchmarks, depending on the operation and data types involved.
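
      To illustrate why the replacement is semantically safe, here is a minimal scalar model in Java. It is only a sketch, not HotSpot code: the class `CpyModel` and its methods are hypothetical names, lanes are modeled as bytes, and the predicate as a `boolean[]`. It assumes the architectural behavior that zeroing `CPY` clears inactive lanes, merging `CPY` leaves them unchanged, and `MOVI` with a zero immediate clears the whole destination register first.

```java
import java.util.Arrays;

// Hypothetical scalar model of the two SVE CPY forms (not HotSpot code).
final class CpyModel {

    // CPY (immediate, zeroing): active lanes get imm, inactive lanes become 0.
    static byte[] cpyZeroing(boolean[] pred, byte imm) {
        byte[] dst = new byte[pred.length];
        for (int i = 0; i < pred.length; i++) {
            dst[i] = pred[i] ? imm : 0;
        }
        return dst;
    }

    // MOVI with a zero immediate: clears every lane of the destination.
    static void moviZero(byte[] dst) {
        Arrays.fill(dst, (byte) 0);
    }

    // CPY (immediate, merging): active lanes get imm, inactive lanes keep
    // whatever value the destination already held.
    static void cpyMerging(byte[] dst, boolean[] pred, byte imm) {
        for (int i = 0; i < pred.length; i++) {
            if (pred[i]) {
                dst[i] = imm;
            }
        }
    }

    // The proposed sequence: MOVI #0 followed by CPY (immediate, merging).
    static byte[] moviThenCpyMerging(boolean[] pred, byte imm) {
        byte[] dst = new byte[pred.length];
        Arrays.fill(dst, (byte) 0x7f); // simulate stale register contents
        moviZero(dst);
        cpyMerging(dst, pred, imm);
        return dst;
    }

    public static void main(String[] args) {
        boolean[] pred = {true, false, true, true, false, false, true, false};
        byte[] zeroing = cpyZeroing(pred, (byte) 1);
        byte[] merging = moviThenCpyMerging(pred, (byte) 1);
        System.out.println(Arrays.toString(zeroing));
        System.out.println(Arrays.equals(zeroing, merging));
    }
}
```

      Both paths produce the same lane values for any predicate and immediate, so the choice between them is purely a performance decision on the target core.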

      Currently, the SVE `CPY (immediate, zeroing)` instruction is used in code generated for `VectorStoreMaskNode` and `VectorReinterpretNode`. This optimization benefits all Vector API methods that generate these two IR nodes, such as `VectorMask.intoArray()` and `VectorMask.toLong()`.

            Assignee: Eric Fang
            Reporter: Eric Fang
            Votes: 0
            Watchers: 2

              Created:
              Updated: