When auto-vectorization applies to loops with min/max comparisons, the corresponding packed SIMD instructions (VMINMAX[PH,PS,PD]) are used, and the added parallelism provides a significant performance boost. When auto-vectorization isn't possible, however, the scalar instruction variants (VMINMAX[SH,SS,SD]) are currently used. The primary example of this is a min/max reduction loop.
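For illustration, the two loop shapes can be sketched in Java as follows. This is a minimal sketch: the class and method names (`MinMaxLoops`, `elementwiseMin`, `minReduction`) are hypothetical and not taken from the JDK sources or tests, and whether a given loop is actually vectorized depends on the JIT compiler and platform.

```java
import java.util.Arrays;

// Hypothetical example class; names are illustrative only.
final class MinMaxLoops {
    // Element-wise min: each iteration is independent of the others,
    // which is the shape that auto-vectorization typically targets.
    static void elementwiseMin(float[] a, float[] b, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.min(a[i], b[i]);
        }
    }

    // Min reduction: every iteration depends on the previous value of
    // `result`, which is the case described above as falling back to
    // the scalar min/max instruction.
    static float minReduction(float[] a) {
        float result = Float.POSITIVE_INFINITY;
        for (int i = 0; i < a.length; i++) {
            result = Math.min(result, a[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        float[] a = {3.0f, -1.5f, 7.0f, 0.5f};
        float[] b = {2.0f, 4.0f, -8.0f, 0.5f};
        float[] out = new float[a.length];
        elementwiseMin(a, b, out);
        System.out.println(Arrays.toString(out));
        System.out.println(minReduction(a));
    }
}
```

The reduction starts from `Float.POSITIVE_INFINITY` so that an empty array yields the identity element for `min`.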
On platforms that don't support the AVX10 ISA, a distinct sequence of instructions is used in place of the scalar min/max instructions for specification compliance. This approach may also be beneficial from a performance perspective on platforms that support AVX10. The key difference is that the replacement instruction sequence should use the AVX10 floating-point comparison instructions (VUCOMX[SS,SD]) instead of the non-AVX10 ones (UCOMI[SS,SD]).
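To show why a plain compare-and-select is not enough for compliance, here is a Java-level sketch of the cases a compare-based replacement for `Math.min` has to cover. It is not the HotSpot instruction sequence, and `MinSemantics` and `minViaCompare` are hypothetical names. It only mirrors the documented `Math.min` behavior: a NaN operand yields NaN, and negative zero is treated as smaller than positive zero.

```java
// Hypothetical example class; illustrates Math.min semantics only,
// not the instruction sequence emitted by the JIT compiler.
final class MinSemantics {
    static float minViaCompare(float a, float b) {
        // A NaN operand must propagate; an ordinary `a < b` test is
        // false for NaN and would silently return the other operand.
        if (Float.isNaN(a)) return a;
        if (Float.isNaN(b)) return b;
        // -0.0f == 0.0f under numeric comparison, so the sign bit has
        // to be inspected to prefer negative zero.
        if (a == 0.0f && b == 0.0f) {
            return (Float.floatToRawIntBits(a) < 0) ? a : b;
        }
        return (a < b) ? a : b;
    }

    public static void main(String[] args) {
        float[][] cases = {
            {1.0f, 2.0f}, {0.0f, -0.0f}, {-0.0f, 0.0f},
            {Float.NaN, 1.0f}, {1.0f, Float.NaN}
        };
        for (float[] c : cases) {
            // Float.compare distinguishes -0.0f from 0.0f and treats NaN as equal to NaN.
            boolean same = Float.compare(minViaCompare(c[0], c[1]), Math.min(c[0], c[1])) == 0;
            System.out.println(c[0] + ", " + c[1] + " matches Math.min: " + same);
        }
    }
}
```

The NaN and signed-zero branches are the parts that depend on how the floating-point comparison reports unordered and equal results, which is where the choice of comparison instruction matters.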
Relates to:
- JDK-8352675 Support Intel AVX10 converged vector ISA feature detection (Resolved)
- JDK-8360116 Add support for AVX10 floating point minmax instruction (Resolved)
- JDK-8371955 Support AVX10 floating point comparison instructions (Resolved)