The allTrue and anyTrue operations are implemented using the ptest/vptest instruction. Two optimizations are possible:
1) The ptest instruction operates on a minimum of 128 bits.
Operations on vectors smaller than 128 bits can be implemented by first broadcasting (duplicating) the input to 128 bits.
The two inputs to these operations are:
a) Vector mask being tested
b) All ones
For the allTrue operation, both inputs need to be broadcast.
For the anyTrue operation, only the first input (the vector mask) needs to be broadcast.
2) An anyTrue operation followed by a comparison with zero can use the zero flag set by ptest/vptest directly, without first materializing the boolean result.
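
The flag logic behind both points can be sketched in portable C. This is only a model for illustration, not the actual code generator: the vec128 type and the helpers make_vec, ptest, broadcast64, all_true64, any_true64, none_true64 and check_examples are hypothetical names, and a 64-bit mask stands in for "smaller than 128 bits". The flag definitions follow the documented ptest semantics: ZF is set when (src AND dst) is zero, CF is set when (src AND NOT dst) is zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit XMM register as two 64-bit halves. */
typedef struct { uint64_t lo, hi; } vec128;

/* Flags produced by "ptest dst, src". */
typedef struct { bool zf, cf; } ptest_flags;

static vec128 make_vec(uint64_t lo, uint64_t hi) {
    vec128 v = { lo, hi };
    return v;
}

/* ZF = ((src AND dst) == 0), CF = ((src AND NOT dst) == 0). */
static ptest_flags ptest(vec128 dst, vec128 src) {
    ptest_flags f;
    f.zf = ((src.lo & dst.lo) | (src.hi & dst.hi)) == 0;
    f.cf = ((src.lo & ~dst.lo) | (src.hi & ~dst.hi)) == 0;
    return f;
}

/* Duplicate the low 64 bits into the high half (models the broadcast step). */
static vec128 broadcast64(vec128 v) {
    v.hi = v.lo;
    return v;
}

/* allTrue on a 64-bit mask: both inputs are broadcast, the result is CF. */
static bool all_true64(vec128 mask, vec128 ones) {
    return ptest(broadcast64(mask), broadcast64(ones)).cf;
}

/* anyTrue on a 64-bit mask: only the mask is broadcast, the result is !ZF. */
static bool any_true64(vec128 mask, vec128 ones) {
    return !ptest(broadcast64(mask), ones).zf;
}

/* "anyTrue(mask) == 0" reads ZF directly; no boolean is built and re-tested. */
static bool none_true64(vec128 mask, vec128 ones) {
    return ptest(broadcast64(mask), ones).zf;
}

/* Small usage example; only the low 64 bits of each mask are meaningful. */
void check_examples(void) {
    vec128 ones = make_vec(~0ULL, 0);           /* high half not relied upon */
    vec128 full = make_vec(~0ULL, 0xABCDULL);   /* all lanes set, junk above */
    vec128 part = make_vec(0x00FFULL, ~0ULL);   /* some lanes set            */
    vec128 none = make_vec(0, ~0ULL);           /* no lanes set, junk above  */

    assert(all_true64(full, ones));
    assert(!all_true64(part, ones));
    assert(any_true64(part, ones));
    assert(!any_true64(none, ones));
    assert(none_true64(none, ones));
}
```

In this model the stale high half of the mask never influences the result, because the broadcast overwrites it with a copy of the low half before the flags are computed. On real hardware the same flags are exposed in C through the SSE4.1 intrinsics _mm_testc_si128 (CF) and _mm_testz_si128 (ZF).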