JDK-8297172

Fix some issues of auto-vectorization of `Long.bitCount/numberOfTrailingZeros/numberOfLeadingZeros()`


    • b27
    • generic
    • generic

      1. The Java API methods `Long.bitCount()`, `Long.numberOfTrailingZeros()` and `Long.numberOfLeadingZeros()` return `int`, but the corresponding Vector API operations return `long`. Currently, to support auto-vectorization and the Vector API at the same time, the backend provides two vector implementations for each of them: one with an int vector type and one with a long vector type, as discussed in https://github.com/openjdk/panama-vector/pull/185#discussion_r836017952.
      We can refine the auto-vectorization of these APIs in superword to unify the vector implementations in the backend and remove the extra code.

      2. Also, `Long.bitCount()` can't be vectorized when `-XX:MaxVectorSize=16`, which causes an IR match failure in compiler/vectorization/TestPopCountVectorLong.java on 128-bit SVE platforms. This task needs to fix that as well.

      3. Currently, `Long.numberOfLeadingZeros()` and `Long.numberOfTrailingZeros()` can be vectorized on SVE platforms when `-XX:MaxVectorSize=32` or `-XX:MaxVectorSize=64`, but the generated code is not correct. For example:
      ```
      LOOP:
        sxtw x13, w12
        add x14, x15, x13, uxtx #3
        add x17, x14, #0x10
        ld1d {z16.d}, p7/z, [x17]
        // Incorrectly use integer rbit/clz insn for long type vector
       *rbit z16.s, p7/m, z16.s
       *clz z16.s, p7/m, z16.s
        add x13, x16, x13, uxtx #2
        str q16, [x13, #16]
        ...
        add w12, w12, #0x20
        cmp w12, w3
        b.lt LOOP
      ```
      4. On x86 AVX2 platforms, an assertion failure occurs when C2 tries to vectorize loops like:
      ```
      // long[] ia;
      // int[] ic;
      for (int i = 0; i < LENGTH; ++i) {
        ic[i] = Long.numberOfLeadingZeros(ia[i]);
      }
      ```
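
      For reference, a minimal standalone sketch of the scalar behavior behind items 3 and 4. The class name, method names and `LENGTH` value are hypothetical and not taken from the JDK sources or tests. `perHalfTrailingZeros` is only a scalar model of counting trailing zeros in each 32-bit half of a 64-bit lane (the effect of `rbit` + `clz` on `.s` lanes); it does not model how the result is stored afterwards. Whether C2 actually vectorizes `leadingZeros` depends on the JVM build, flags and hardware.

      ```java
      // Hypothetical standalone check; names here are illustrative assumptions.
      final class LongBitOpsCheck {
          static final int LENGTH = 1024; // assumed array length

          // Loop shape from item 4: long inputs, int outputs.
          static void leadingZeros(long[] ia, int[] ic) {
              for (int i = 0; i < LENGTH; ++i) {
                  ic[i] = Long.numberOfLeadingZeros(ia[i]);
              }
          }

          // Scalar model of a trailing-zero count applied to each 32-bit half
          // of a 64-bit value separately, packed back into one 64-bit value.
          static long perHalfTrailingZeros(long x) {
              long lo = Integer.numberOfTrailingZeros((int) x);
              long hi = Integer.numberOfTrailingZeros((int) (x >>> 32));
              return (hi << 32) | lo;
          }

          public static void main(String[] args) {
              long[] ia = new long[LENGTH];
              int[] ic = new int[LENGTH];
              for (int i = 0; i < LENGTH; ++i) {
                  ia[i] = 1L << (i % 64);
              }
              leadingZeros(ia, ic);
              System.out.println(ic[0] + " " + ic[63]); // prints "63 0"

              long x = 1L << 40;
              // Correct 64-bit result.
              System.out.println(Long.numberOfTrailingZeros(x)); // prints 40
              // Per-half result: high half gives 8, low half (all zero) gives 32.
              System.out.println(Long.toHexString(perHalfTrailingZeros(x))); // prints 800000020
          }
      }
      ```

      The last two lines of output differ, which illustrates why integer-width instructions cannot be used on a long vector for these operations.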

            fgao Fei Gao