-
Enhancement
-
Resolution: Unresolved
-
P4
-
25
-
generic
-
generic
The following cases are not vectorized after the bug fix for JDK-8350835:
Case 1) byte array
public static byte[] aB = new byte[10000];
...
for (int i = 0; i < aF.length; i++) {
aF[i] = Float.float16ToFloat(aB[i]);
}
Case 2) int array
public static int[] aI = new int[10000];
...
for (int i = 0; i < aF.length; i++) {
aF[i] = Float.float16ToFloat((short)aI[i]);
}
Prior to the JDK-8350835 fix, these cases resulted in wrong vectorization/code generation in the product build. The JDK-8350835 fix disabled auto-vectorization for these cases.
Let us take Case 1) with byte array:
Scalar code for loop body: LoadB, ConvHF2F, StoreF
Vectorized code: LoadVector, VectorCastHF2F, StoreVector
LoadB loads a byte with sign extension, so the input to ConvHF2F is a signed short.
LoadVector loads byte elements into a vector register, so the input to VectorCastHF2F is a byte vector.
But VectorCastHF2F always expects a short vector as input, so this results in wrong code generation.
This could be fixed in one of two ways:
Either the vectorizer adds a vector byte-to-short conversion before VectorCastHF2F,
or the VectorCastHF2F code generation handles a byte vector as input in addition to a short vector.
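The semantics the vectorized code has to preserve for Case 1 can be sketched in plain Java. This is only an illustration, not C2 code: the class and method names (ByteToHalfFloatSketch, scalarConvert, convertWithWidening) are hypothetical, and the second method merely models the first proposed fix (widen every byte lane to a short lane, then convert). It assumes a JDK that provides Float.float16ToFloat (JDK 20 or later).

```java
final class ByteToHalfFloatSketch {
    // Scalar semantics: the byte is sign-extended to a short (as LoadB does)
    // and that short is then interpreted as float16 bits.
    static float scalarConvert(byte b) {
        short widened = b; // implicit widening with sign extension
        return Float.float16ToFloat(widened);
    }

    // Hypothetical model of "byte-to-short conversion before VectorCastHF2F":
    // first widen all lanes, then convert the short lanes to float.
    static float[] convertWithWidening(byte[] src) {
        short[] widened = new short[src.length];
        for (int i = 0; i < src.length; i++) {
            widened[i] = src[i]; // lane-wise sign extension
        }
        float[] dst = new float[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = Float.float16ToFloat(widened[i]);
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] input = {0, 1, -1, 60, -128, 127};
        float[] result = convertWithWidening(input);
        for (int i = 0; i < input.length; i++) {
            System.out.println(input[i] + " -> " + result[i]
                    + " (scalar: " + scalarConvert(input[i]) + ")");
        }
    }
}
```

Any correct vectorization has to produce, lane by lane, the same values as scalarConvert; feeding the byte lanes to the conversion without the widening step is what broke that equivalence.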
For Case 2) with int array:
Scalar code for loop body: LoadI, LShiftI by 16, RShiftI by 16, ConvHF2F, StoreF
Vectorized code: LoadVector, LShiftVI, RShiftVI, VectorCastHF2F, StoreVector
Here LoadI, LShiftI by 16, and RShiftI by 16 together perform the int-to-short conversion with sign extension.
LoadVector loads int elements into a vector register, so the input to VectorCastHF2F is an int vector.
But, as above, VectorCastHF2F always expects a short vector as input, so this results in wrong code generation.
This could be fixed in one of two ways:
Either the vectorizer adds a vector int-to-short conversion before VectorCastHF2F,
or the VectorCastHF2F code generation handles an int vector as input in addition to a short vector.
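The shift pair in Case 2 is the usual expansion of the (short) cast on an int. A minimal Java sketch of that equivalence follows; the class and method names (IntToShortNarrowingSketch, viaCast, viaShifts) are hypothetical and only illustrate the scalar arithmetic, not how C2 represents it internally.

```java
final class IntToShortNarrowingSketch {
    // Explicit narrowing cast: keep the low 16 bits, then sign-extend back to int.
    static int viaCast(int x) {
        return (short) x;
    }

    // The same result written as the shift pair: left shift by 16 discards the
    // upper half, arithmetic right shift by 16 sign-extends from bit 15.
    static int viaShifts(int x) {
        return (x << 16) >> 16;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 0x12345678, 0x00008000, Integer.MIN_VALUE};
        for (int x : samples) {
            System.out.println(x + ": cast=" + viaCast(x) + ", shifts=" + viaShifts(x));
        }
    }
}
```

After the shifts each value still occupies a full 32-bit int, only with its upper 16 bits replaced by copies of bit 15. In the vectorized form the lanes therefore remain int-sized, which is why an explicit int-to-short vector conversion (or an int-aware VectorCastHF2F) is needed.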
- relates to
-
JDK-8350835 C2 SuperWord: assert/wrong result when using Float.float16ToFloat with byte instead of short input
-
- Resolved
-