JDK-8265263

AArch64: Combine vneg with right shift count

Details

    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • Fix Version: 19
    • Affects Version: 17
    • Component: hotspot
    • Resolved In Build: b13
    • CPU: aarch64
    • OS: generic

    Description

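      AArch64 does *not* have a SIMD instruction that shifts right by a count held in a register (only immediate forms such as "ushr" and "sshr" exist). Because of this, a right shift by a variable count is implemented as a left shift by the negated count, and an extra "vneg" is emitted before each left shift.

      By combining the "vneg" with RShiftCntV, so that the count is negated once where it is produced, those extra "vneg" instructions can be eliminated.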
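
      The idea can be illustrated with a small scalar model. The following is only a sketch, not code from the patch: the class and method names are hypothetical, and ushlLane is a simplified model of one 16-bit lane of "ushl" (signed count taken from the low byte, negative counts shift right). It shows that negating the count once, outside the per-vector work, gives the same result as negating it before every shift.

```java
import java.util.Arrays;

public class UshlSketch {
    // Simplified model of one 16-bit lane of USHL (an assumption for
    // illustration): the count is the signed low byte of the count lane;
    // a non-negative count shifts left, a negative count shifts right.
    static int ushlLane(int value, int count) {
        int v = value & 0xFFFF;
        int c = (byte) count;
        if (c >= 16 || c <= -16) {
            return 0; // every bit is shifted out of the lane
        }
        return c >= 0 ? (v << c) & 0xFFFF : v >>> -c;
    }

    // "Before": the count is negated again ahead of every shift.
    static int[] shiftRightNegEachTime(int[] lanes, int shift) {
        int[] out = new int[lanes.length];
        for (int i = 0; i < lanes.length; i++) {
            int negated = -shift; // models the repeated "neg"
            out[i] = ushlLane(lanes[i], negated);
        }
        return out;
    }

    // "After": the count is negated once, where it is produced.
    static int[] shiftRightNegOnce(int[] lanes, int shift) {
        int negated = -shift; // models negation folded into RShiftCntV
        int[] out = new int[lanes.length];
        for (int i = 0; i < lanes.length; i++) {
            out[i] = ushlLane(lanes[i], negated);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] lanes = {0x8000, 0x1234, 0xFFFF, 0x0001};
        int[] a = shiftRightNegEachTime(lanes, 4);
        int[] b = shiftRightNegOnce(lanes, 4);
        System.out.println(Arrays.toString(b) + " equal=" + Arrays.equals(a, b));
    }
}
```

      The model only covers shift counts in the range 0 to 15; it does not attempt to reproduce Java's masking of larger counts.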

      Before:
        0x0000ffffa9106c68: ldr q17, [x15, #16]
        0x0000ffffa9106c6c: add x14, x10, x14
        0x0000ffffa9106c70: neg v18.16b, v16.16b
        0x0000ffffa9106c74: ushl v17.8h, v17.8h, v18.8h
        0x0000ffffa9106c78: str q17, [x14, #16]
        0x0000ffffa9106c7c: ldr q17, [x15, #32]
        0x0000ffffa9106c80: neg v18.16b, v16.16b
        0x0000ffffa9106c84: ushl v17.8h, v17.8h, v18.8h
        0x0000ffffa9106c88: str q17, [x14, #32]
        0x0000ffffa9106c8c: ldr q17, [x15, #48]
        0x0000ffffa9106c90: neg v18.16b, v16.16b
        0x0000ffffa9106c94: ushl v17.8h, v17.8h, v18.8h
        0x0000ffffa9106c98: str q17, [x14, #48]
        0x0000ffffa9106c9c: ldr q17, [x15, #64]
        0x0000ffffa9106ca0: neg v18.16b, v16.16b
        0x0000ffffa9106ca4: ushl v17.8h, v17.8h, v18.8h
        0x0000ffffa9106ca8: str q17, [x14, #64]
        0x0000ffffa9106cac: ldr q17, [x15, #80]
        0x0000ffffa9106cb0: neg v18.16b, v16.16b
        0x0000ffffa9106cb4: ushl v17.8h, v17.8h, v18.8h

      After:
        0x0000ffff81106af8: ldr q17, [x15, #16]
        0x0000ffff81106afc: ushl v17.8h, v17.8h, v16.8h
        0x0000ffff81106b00: add x14, x10, x14
        0x0000ffff81106b04: str q17, [x14, #16]
        0x0000ffff81106b08: ldr q17, [x15, #32]
        0x0000ffff81106b0c: ushl v17.8h, v17.8h, v16.8h
        0x0000ffff81106b10: str q17, [x14, #32]
        0x0000ffff81106b14: ldr q17, [x15, #48]
        0x0000ffff81106b18: ushl v17.8h, v17.8h, v16.8h
        0x0000ffff81106b1c: str q17, [x14, #48]
        0x0000ffff81106b20: ldr q17, [x15, #64]
        0x0000ffff81106b24: ushl v17.8h, v17.8h, v16.8h
        0x0000ffff81106b28: str q17, [x14, #64]
        0x0000ffff81106b2c: ldr q17, [x15, #80]
        0x0000ffff81106b30: ushl v17.8h, v17.8h, v16.8h
        0x0000ffff81106b34: str q17, [x14, #80]
        0x0000ffff81106b38: ldr q17, [x15, #96]
        0x0000ffff81106b3c: ushl v17.8h, v17.8h, v16.8h
        0x0000ffff81106b40: str q17, [x14, #96]
        0x0000ffff81106b44: ldr q17, [x15, #112]
        0x0000ffff81106b48: ushl v17.8h, v17.8h, v16.8h
        0x0000ffff81106b4c: str q17, [x14, #112]
        0x0000ffff81106b50: ldr q17, [x15, #128]
        0x0000ffff81106b54: ushl v17.8h, v17.8h, v16.8h
        0x0000ffff81106b58: str q17, [x14, #128]

      AArch32 benefits from this approach.
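
      For context, a loop of roughly the following shape is the kind of Java source that could lead to such code when the JIT auto-vectorizes it. This is a hypothetical example, not the test case from the issue: the class name, array sizes and iteration count are assumptions, and whether the loop is actually vectorized depends on the JVM build and flags.

```java
public class VarShiftLoop {
    // Hypothetical kernel: unsigned right shift of 16-bit elements by a
    // loop-invariant count that is not a compile-time constant.
    static void shiftRight(char[] src, char[] dst, int shift) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = (char) (src[i] >>> shift);
        }
    }

    public static void main(String[] args) {
        char[] src = new char[1024];
        for (int i = 0; i < src.length; i++) {
            src[i] = (char) (i * 37);
        }
        char[] dst = new char[src.length];
        int shift = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        // Repeat the call so the method becomes hot enough to be compiled.
        for (int iter = 0; iter < 20_000; iter++) {
            shiftRight(src, dst, shift);
        }
        System.out.println((int) dst[100]);
    }
}
```

      Because the count is the same for every iteration, negating it once outside the loop body is sufficient, which is what the "After" listing reflects.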

            People

              haosun Hao Sun (Inactive)
              eliu Eric Liu (Inactive)