- Enhancement
- Resolution: Unresolved
- P4
- repo-panama
In bignum arithmetic, it is common to implement multiplication by splitting each number into k-bit limbs and then using a k-to-2k bit multiply operation. Typical values for k are 32 and 64. Many modern implementations of crypto algorithms exploit parallelism by performing several 32-to-64 bit multiply operations at once. Some processors have an instruction for a 32-to-64 bit multiply, but not for a 64-to-64 bit multiply low. When both 64-bit operands are known to hold 32-bit values, these processors will benefit from an optimization that computes the product with a single 32-to-64 bit multiply, skipping the partial products that involve the high halves of the operands (which are always 0).
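To illustrate the arithmetic behind the request, here is a minimal scalar sketch in Java. It is not the proposed compiler change; the class and method names (`WideningMultiplySketch`, `mulLowGeneral`, `mulLowZeroExtended`) are hypothetical and exist only for this example. It assumes a machine whose only wide multiply is 32-to-64 bit, so a general 64-to-64 bit multiply low needs three such multiplies, whereas operands whose upper 32 bits are zero need only one.

```java
final class WideningMultiplySketch {
    // Mask selecting the low 32 bits of a 64-bit value.
    static final long LOW_MASK = 0xFFFFFFFFL;

    // General 64-to-64 bit "multiply low", built from three products of
    // 32-bit halves. The aHi * bHi term is omitted because it only
    // affects bits 64 and above.
    static long mulLowGeneral(long a, long b) {
        long aLo = a & LOW_MASK, aHi = a >>> 32;
        long bLo = b & LOW_MASK, bHi = b >>> 32;
        long lowProduct = aLo * bLo;          // 32-to-64 bit multiply
        long cross = aLo * bHi + aHi * bLo;   // two more 32-to-64 bit multiplies
        return lowProduct + (cross << 32);    // wraps modulo 2^64
    }

    // Assumption: the caller guarantees the upper 32 bits of both operands
    // are zero. Both cross terms vanish, so one 32-to-64 bit multiply
    // already yields the full result.
    static long mulLowZeroExtended(long a, long b) {
        return (a & LOW_MASK) * (b & LOW_MASK);
    }
}
```

For operands that fit in 32 bits, both methods return the same value as Java's built-in `a * b`; the second simply does a third of the multiply work under the stated assumption.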
- blocks
  - JDK-8219878 Vectorized Poly1305 benchmark (Resolved)
- relates to
  - JDK-8341137 Optimize long vector multiplication using x86 VPMUL[U]DQ instruction (Resolved)