- Enhancement
- Resolution: Fixed
- P4
- 17, 18
- b08
- generic
- generic
The SHA3 algorithm iteratively performs arithmetic operations on a batch of 25 long values; see the SHA3.keccak function: https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/SHA3.java#L258
Manual inlining, manual unrolling, and moving the data from the 25-element array into local variables speed up execution by up to about 1.9 times on ARM64 and about 1.5 times on AMD (timings below).
// see the attached diff
private void keccak() {                        // ARM64    AMD
    keccak_0_default();                        // 2764ms   1846ms
    // keccak_1_nativeimpl();                  // 1701ms   1649ms
    // keccak_2_inlined_unrolled();            // 1754ms   1499ms
    // keccak_3_inlined_unrolled_localvars();  // 1437ms   1261ms
}
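To illustrate the kind of transformation being measured, here is a minimal sketch that applies it to a single Keccak step (theta) only. It is not the attached diff and not the JDK code: the class and method names (KeccakThetaSketch, thetaLoop, thetaUnrolled) are hypothetical, and the lane layout a[x + 5*y] is an assumption that may differ from the indexing used in SHA3.java.

```java
import java.util.Arrays;

// Hypothetical illustration; names and lane layout a[x + 5*y] are assumptions.
final class KeccakThetaSketch {

    // Baseline: theta step written with loops over the 25-element array.
    static void thetaLoop(long[] a) {
        long[] c = new long[5];
        for (int x = 0; x < 5; x++) {
            c[x] = a[x] ^ a[x + 5] ^ a[x + 10] ^ a[x + 15] ^ a[x + 20];
        }
        for (int x = 0; x < 5; x++) {
            long d = c[(x + 4) % 5] ^ Long.rotateLeft(c[(x + 1) % 5], 1);
            for (int y = 0; y < 25; y += 5) {
                a[x + y] ^= d;
            }
        }
    }

    // Same step, manually unrolled, with the state copied into local variables.
    static void thetaUnrolled(long[] a) {
        long a0 = a[0], a1 = a[1], a2 = a[2], a3 = a[3], a4 = a[4];
        long a5 = a[5], a6 = a[6], a7 = a[7], a8 = a[8], a9 = a[9];
        long a10 = a[10], a11 = a[11], a12 = a[12], a13 = a[13], a14 = a[14];
        long a15 = a[15], a16 = a[16], a17 = a[17], a18 = a[18], a19 = a[19];
        long a20 = a[20], a21 = a[21], a22 = a[22], a23 = a[23], a24 = a[24];

        // Column parities.
        long c0 = a0 ^ a5 ^ a10 ^ a15 ^ a20;
        long c1 = a1 ^ a6 ^ a11 ^ a16 ^ a21;
        long c2 = a2 ^ a7 ^ a12 ^ a17 ^ a22;
        long c3 = a3 ^ a8 ^ a13 ^ a18 ^ a23;
        long c4 = a4 ^ a9 ^ a14 ^ a19 ^ a24;

        // Per-column mixing values.
        long d0 = c4 ^ Long.rotateLeft(c1, 1);
        long d1 = c0 ^ Long.rotateLeft(c2, 1);
        long d2 = c1 ^ Long.rotateLeft(c3, 1);
        long d3 = c2 ^ Long.rotateLeft(c4, 1);
        long d4 = c3 ^ Long.rotateLeft(c0, 1);

        // Write back; a full implementation would keep the locals live
        // across all rounds and store to the array only once at the end.
        a[0] = a0 ^ d0;   a[1] = a1 ^ d1;   a[2] = a2 ^ d2;   a[3] = a3 ^ d3;   a[4] = a4 ^ d4;
        a[5] = a5 ^ d0;   a[6] = a6 ^ d1;   a[7] = a7 ^ d2;   a[8] = a8 ^ d3;   a[9] = a9 ^ d4;
        a[10] = a10 ^ d0; a[11] = a11 ^ d1; a[12] = a12 ^ d2; a[13] = a13 ^ d3; a[14] = a14 ^ d4;
        a[15] = a15 ^ d0; a[16] = a16 ^ d1; a[17] = a17 ^ d2; a[18] = a18 ^ d3; a[19] = a19 ^ d4;
        a[20] = a20 ^ d0; a[21] = a21 ^ d1; a[22] = a22 ^ d2; a[23] = a23 ^ d3; a[24] = a24 ^ d4;
    }

    public static void main(String[] args) {
        long[] x = new long[25];
        for (int i = 0; i < 25; i++) {
            x[i] = 0x9E3779B97F4A7C15L * (i + 1); // arbitrary test pattern
        }
        long[] y = x.clone();
        thetaLoop(x);
        thetaUnrolled(y);
        System.out.println("variants agree: " + Arrays.equals(x, y));
    }
}
```

Both variants compute the same result; the unrolled one simply removes the loop-carried index arithmetic and array accesses from the hot path, which is the effect the timings above attribute to keccak_3_inlined_unrolled_localvars(). Any actual speedup would need to be confirmed with a benchmark on the target JIT and hardware.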
- relates to
- JDK-8275913 C2 does not optimize memory access within a loop (Open)