- The base (non-BMI2) variable-count scalar shifts perform poorly and should be replaced by their BMI2 counterparts (`shlx`, `shrx`, `sarx`):
+ Bound operands: the shift count must be in `cl`
+ Multiple uops in both the fused and unfused domains
+ Possible flag stalls, since the flag output is unpredictable (it depends on the shift count)
- Materializing a flag into a general-purpose register currently uses `cmov`. This could be replaced by `set`, which transforms the sequence:
    xorl dst, dst
    sometest
    movl tmp, 0x01
    cmovlcc dst, tmp
into:
    xorl dst, dst
    sometest
    setbcc dst
The new sequence saves the one uop of the `mov` and one register, without any drawback.
(Note: `movzx` does not work here, since move elision requires different input and output registers.)
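As a rough illustration, the two items above concern Java source shapes like the ones below. This is only a sketch: the class and method names are hypothetical, and whether C2 actually selects the instructions discussed for these methods depends on the CPU, the flags, and the surrounding code.

```java
// Hypothetical demo class; the names are illustrative and not taken from the JDK sources.
final class ShiftAndFlagPatterns {
    // Variable-count shift: the count is not a compile-time constant.
    // Java masks the count to its low 5 bits for int operands.
    static int shiftLeft(int x, int n) {
        return x << n;
    }

    // Flag-to-register pattern: a comparison result materialized as 0 or 1.
    static int lessThanAsInt(int a, int b) {
        return a < b ? 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(shiftLeft(1, 33));     // count 33 is masked to 1
        System.out.println(lessThanAsInt(3, 7));
        System.out.println(lessThanAsInt(7, 3));
    }
}
```

Methods like these could serve as a starting point for inspecting the generated code or for a microbenchmark comparing the old and new sequences.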
Some small improvements:
- Add memory variants of `tzcnt` and `lzcnt`
- Add memory variants of `rorx`, for both left and right rotates
- Add rotate-left rules using `rorx` (BMI2 has no `rolx` instruction, but a left rotate by `imm` is equivalent to `rorx dst, size - imm`)
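The rotate equivalence in the last item can be checked at the Java level with the standard `Integer` and `Long` rotate methods. This is a minimal sketch that only verifies the arithmetic identity, not the code C2 emits; the class name `RotateIdentity` is hypothetical.

```java
// Hypothetical demo class; only java.lang methods are used.
final class RotateIdentity {
    // True if rotating left by k equals rotating right by (32 - k) for all k in 1..31.
    static boolean holdsForInt(int x) {
        for (int k = 1; k < Integer.SIZE; k++) {
            if (Integer.rotateLeft(x, k) != Integer.rotateRight(x, Integer.SIZE - k)) {
                return false;
            }
        }
        return true;
    }

    // Same check for 64-bit values, with size = 64.
    static boolean holdsForLong(long x) {
        for (int k = 1; k < Long.SIZE; k++) {
            if (Long.rotateLeft(x, k) != Long.rotateRight(x, Long.SIZE - k)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holdsForInt(0x12345678));
        System.out.println(holdsForLong(0x0123456789ABCDEFL));
    }
}
```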
Relates to:
- JDK-8324720 Instruction selection does not respect -XX:-UseBMI2Instructions flag (Open)
- JDK-8336860 x86: Change integer src operand for CMoveL of 0 and 1 to long (Resolved)