-
Enhancement
-
Resolution: Unresolved
-
P4
-
15
-
x86_64
- The current implementation of the vector logical NOT operation performs an XOR between the input vector and a broadcast -1 value, which is read from externally initialized memory.
- On non-AVX3 targets, the broadcast can be made more efficient by eliminating the read from external memory, which may cause a cache miss.
- On AVX3 targets, a single ternary logic instruction is sufficient to replace the complete pattern of XOR and broadcast operations.
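The identity behind the description can be illustrated with a minimal scalar sketch in Java. It is not the HotSpot implementation: the class `VectorNotSketch` and its methods are hypothetical names introduced here. The `ternaryLogic` helper is only a bitwise model of a three-input truth-table lookup, and it assumes the 8-bit immediate is indexed by `(a << 2) | (b << 1) | c`. Under that assumption the immediate `0x0F` encodes NOT of the first operand, so no all-ones constant is needed.

```java
class VectorNotSketch {
    // Scalar model of the current pattern: NOT expressed as XOR with an all-ones (-1) value.
    static int[] notViaXor(int[] in) {
        int[] out = new int[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = in[i] ^ -1;
        }
        return out;
    }

    // Bitwise model of a three-input ternary logic operation on one 32-bit lane.
    // Assumption: bit (a<<2 | b<<1 | c) of imm8 gives the result for that input combination.
    static int ternaryLogic(int a, int b, int c, int imm8) {
        int result = 0;
        for (int bit = 0; bit < 32; bit++) {
            int idx = (((a >>> bit) & 1) << 2)
                    | (((b >>> bit) & 1) << 1)
                    | ((c >>> bit) & 1);
            result |= ((imm8 >>> idx) & 1) << bit;
        }
        return result;
    }

    // NOT via the truth table 0x0F, which is 1 exactly when the first input bit is 0.
    static int[] notViaTernaryLogic(int[] in) {
        int[] out = new int[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = ternaryLogic(in[i], in[i], in[i], 0x0F);
        }
        return out;
    }
}
```

Both `notViaXor` and `notViaTernaryLogic` should agree with Java's `~` operator on every element; the second form shows why a single truth-table instruction can stand in for the XOR plus broadcast pair.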
- relates to
-
JDK-8223198 Enhance auto vectorization for aarch64
- Open
-
JDK-8241040 Support for AVX-512 Ternary Logic Instruction
- Resolved