JDK-8338126

C2 SuperWord: VectorCastF2HF / vcvtps2ph produces wrong results for vector length 2


    • b21
    • x86_64

      It seems that vcvtps2ph is only implemented for vector lengths 4, 8, and 16 on x64. Not sure about aarch64 or other platforms.
      https://www.felixcloutier.com/x86/vcvtps2ph

      But in the example below, we see that we still vectorize a 2-element vector with Float.floatToFloat16 / VectorCastF2HF / vcvtps2ph. It looks like it just generates a 4-element vcvtps2ph, which then stores 8 bytes instead of the desired 4 bytes. The 4 lower bytes have the correct values, but the upper 4 bytes are all zero. The vcvtps2ph operation stores directly to memory, meaning it overwrites the 4 bytes that follow the intended store with zeros, which produces wrong results.
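The store-width problem described above can be illustrated with a small pure-Java model. This is only a sketch and not HotSpot code: the class `NarrowStoreModel`, its methods, the sentinel value, and the stride-4 loop shape are all hypothetical names and assumptions chosen for illustration. It does not execute vcvtps2ph; it only mimics a store that writes 4 half-precision lanes (8 bytes) where 2 lanes (4 bytes) were intended.

```java
import java.util.Arrays;

// Hypothetical model, not HotSpot code: it mimics the store widths only.
class NarrowStoreModel {
    static final short SENTINEL = (short) 0x7777; // marks elements the loop must not touch
    static final short HALF_ONE = (short) 0x3C00; // float16 bit pattern of 1.0
    static final short HALF_TWO = (short) 0x4000; // float16 bit pattern of 2.0

    // Intended 2-lane store: writes exactly 2 shorts (4 bytes).
    static void store2(short[] dst, int at, short lane0, short lane1) {
        dst[at] = lane0;
        dst[at + 1] = lane1;
    }

    // Modeled 4-lane store: writes 4 shorts (8 bytes); the upper two lanes are zero.
    static void store4(short[] dst, int at, short lane0, short lane1) {
        store2(dst, at, lane0, lane1);
        dst[at + 2] = 0;
        dst[at + 3] = 0;
    }

    // Assumed loop shape: stride 4, but only elements i and i+1 should be written.
    static short[] run(int size, boolean wideStore) {
        short[] dst = new short[size];
        Arrays.fill(dst, SENTINEL);
        for (int i = 0; i + 3 < size; i += 4) {
            if (wideStore) {
                store4(dst, i, HALF_ONE, HALF_TWO);
            } else {
                store2(dst, i, HALF_ONE, HALF_TWO);
            }
        }
        return dst;
    }

    // Counts elements where the wide store differs from the intended narrow store.
    static int countErrors(int size) {
        short[] expected = run(size, false);
        short[] actual = run(size, true);
        int errors = 0;
        for (int i = 0; i < size; i++) {
            if (expected[i] != actual[i]) {
                errors++;
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println("errors: " + countErrors(16)); // prints "errors: 8"
    }
}
```

In this model every iteration clobbers the two elements after the intended store, so the lower lanes compare equal while the neighbouring elements do not. The error count here depends only on the assumed array size and is unrelated to the count reported by the actual test.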

      Reproduces the bug:
      java -Xbatch Test2b.java

      It does not reproduce in the interpreter or with SuperWord disabled:
      java -Xint Test2b.java
      java -XX:-UseSuperWord Test2b.java

      More info in this run:
      java -XX:CompileCommand=printcompilation,Test2b::test -XX:CompileCommand=compileonly,Test2b::test -Xbatch -XX:+TraceNewVectors Test2b.java

      Result:
      Exception in thread "main" java.lang.RuntimeException: errors: 480
      at Test2b.main(Test2b.java:30)


      I can reproduce these wrong results in JDK21 through JDK24.

      It looks like a regression of JDK-8289552, which was introduced in JDK20.
      https://github.com/openjdk/jdk/commit/07946aa49c97c93bd11675a9b0b90d07c83f2a94
      https://git.openjdk.org/jdk/pull/9781


      We have to assess whether this applies only to x64, or also to aarch64 or even RISC-V. They all implement VectorCastF2HF.

            sviswanathan Sandhya Viswanathan
            epeter Emanuel Peter
