-
Enhancement
-
Resolution: Fixed
-
P4
-
17, 21, 23, 24
-
b20
-
aarch64
When looking at generated assembly for some intrinsics, I noticed that we have this sequence:
ldr w11, [x25]
lsr x0, x11, #0
I believe this is compressed oops decoding.
The trailing lsr looks redundant: it shifts by zero, so it could be emitted as just mov x0, x11. This would potentially let the code benefit from zero-latency movs. In some cases, when the dest and src regs are the same, the mov can be skipped altogether.
A more generic option would be to have the AArch64 MacroAssembler rewrite these identity bit manipulation instructions to movs or nops where possible (JDK-8341895). This fix is surgical and thus more easily backportable.
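The selection logic described above can be sketched roughly as follows. This is a standalone illustration, not the HotSpot code: the helper name emit_lsr_or_mov and the string-based "emission" are assumptions made for the example, and registers are modeled as plain X-register numbers.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of HotSpot): returns the instructions needed
// for "dst = src >> shift" on 64-bit X registers, special-casing the
// identity shift described in the issue.
std::vector<std::string> emit_lsr_or_mov(int dst, int src, unsigned shift) {
    std::vector<std::string> out;
    const std::string d = "x" + std::to_string(dst);
    const std::string s = "x" + std::to_string(src);
    if (shift != 0) {
        // A real shift is still required.
        out.push_back("lsr " + d + ", " + s + ", #" + std::to_string(shift));
    } else if (dst != src) {
        // Shift by zero is a plain copy: emit mov instead of lsr #0.
        out.push_back("mov " + d + ", " + s);
    }
    // shift == 0 and dst == src: nothing to emit.
    return out;
}
```

Under these assumptions, emit_lsr_or_mov(0, 11, 0) yields a single mov x0, x11, emit_lsr_or_mov(11, 11, 0) yields no instructions, and a non-zero shift still produces the lsr.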
- relates to
-
JDK-8341895 AArch64: Optimize MacroAssembler for identity bit manipulations
- Closed
- links to
-
Commit(master) openjdk/jdk/e3f65039
-
Review(master) openjdk/jdk/21443