Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8260760 | openjdk8u292 | Andrew Hughes | P4 | Resolved | Fixed | b01 |
This change uses SIMD ldp/stp Qx, Qy instructions instead of scalar ldp/stp instructions, thereby loading/storing 32 bytes at a time instead of 16.
It also extends the small copy code to handle 0-96 bytes instead of 0-80 (because 80 is not divisible by 32).
This improves performance on some micro-architectures and not on others, so I have provided a -XX:+UseSIMDForMemoryOps switch, which defaults to false. We could look at enabling it by default for micro-architectures where we know SIMD is better.
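
Since the switch is off by default, one way to exercise the change is to run an array-copy check with the flag enabled. The following is only a sketch, not part of the patch: the class name `ArrayCopyCheck`, the method `copiesCorrectly`, and the 0-256 length range are illustrative assumptions, chosen so that the lengths around the small-copy boundaries (80 and 96 bytes) are covered. Whether the SIMD path is actually taken depends on the JVM build and platform. On an AArch64 build that includes this change it could be run as `java -XX:+UseSIMDForMemoryOps ArrayCopyCheck`; a JVM that does not know the flag will refuse to start.

```java
import java.util.Arrays;

public class ArrayCopyCheck {
    // Copies `length` bytes with System.arraycopy and verifies the result,
    // including one guard byte on each side of the destination range.
    static boolean copiesCorrectly(int length) {
        byte[] src = new byte[length];
        for (int i = 0; i < length; i++) {
            src[i] = (byte) (i * 31 + 7);   // arbitrary, position-dependent pattern
        }
        byte[] dst = new byte[length + 2];
        Arrays.fill(dst, (byte) 0x55);      // guard value
        System.arraycopy(src, 0, dst, 1, length);

        // The guard bytes must be untouched (no over-copy past either end).
        if (dst[0] != 0x55 || dst[length + 1] != 0x55) {
            return false;
        }
        // Every copied byte must match the source.
        for (int i = 0; i < length; i++) {
            if (dst[i + 1] != src[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical range: spans the small-copy sizes (0-96) and beyond.
        for (int length = 0; length <= 256; length++) {
            if (!copiesCorrectly(length)) {
                throw new AssertionError("bad copy at length " + length);
            }
        }
        System.out.println("all lengths copied correctly");
    }
}
```

This only checks correctness; comparing timings with and without the flag would need a proper benchmark harness.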
- backported by
  - JDK-8260760 aarch64: optimise array copy using SIMD instructions (Resolved)
- relates to
  - JDK-8257192 Integrate AArch64 JIT port into 8u (Resolved)