A DESCRIPTION OF THE PROBLEM:
BigInteger currently uses three different algorithms for multiply: the simple quadratic algorithm, the slightly better Karatsuba once the operands exceed a bit-count threshold, and Toom-Cook 3 once we get into several thousands of bits. Since Toom-Cook 3 is a recursive algorithm, it is trivial to parallelize, and I have demonstrated this several times in conference talks. To be consistent with other classes such as Arrays and Collection, I have added a parallelMultiply() method. Internally, we have added a parameter to the private multiply() method to indicate whether the calculation should be done in parallel.
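As a minimal illustration of the new public API (JDK 19 or later), the following sketch multiplies two large random numbers both ways and checks that the results agree; the operand size is arbitrary, chosen only so the Toom-Cook 3 path is reached:

```java
import java.math.BigInteger;
import java.util.concurrent.ThreadLocalRandom;

public class ParallelMultiplyDemo {
    public static void main(String[] args) {
        // Random operands of a few million bits, well past the Karatsuba
        // and Toom-Cook 3 thresholds.
        BigInteger a = new BigInteger(5_000_000, ThreadLocalRandom.current());
        BigInteger b = new BigInteger(5_000_000, ThreadLocalRandom.current());

        BigInteger sequential = a.multiply(b);          // existing method
        BigInteger parallel   = a.parallelMultiply(b);  // new method, same result

        System.out.println(sequential.equals(parallel)); // prints: true
    }
}
```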
The performance improvements are as expected. Fibonacci of 100 million (using a single-threaded Dijkstra's sum-of-squares version) completes in 9.2 seconds with parallelMultiply() vs 25.3 seconds with the sequential multiply() method. This is on my 1-8-2 laptop. The final multiplications involve very large numbers, which then benefit from the parallelization of Toom-Cook 3. Fibonacci 100 million is a 347084-bit number.
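The benchmark program itself is not reproduced here; the sketch below only shows the kind of workload described, computing Fibonacci with the standard doubling identities (equivalent to Dijkstra's sum-of-squares recurrence) so that the final recursion levels multiply very large numbers:

```java
import java.math.BigInteger;

public class FibonacciDoubling {
    // Returns {F(n), F(n+1)} via the doubling identities
    //   F(2k)   = F(k) * (2*F(k+1) - F(k))
    //   F(2k+1) = F(k)^2 + F(k+1)^2
    // The last few levels of the recursion multiply the largest numbers,
    // which is where parallelMultiply() pays off.
    static BigInteger[] fib(long n) {
        if (n == 0) {
            return new BigInteger[] {BigInteger.ZERO, BigInteger.ONE};
        }
        BigInteger[] half = fib(n / 2);
        BigInteger a = half[0];                                          // F(k), k = n/2
        BigInteger b = half[1];                                          // F(k+1)
        BigInteger c = a.parallelMultiply(b.shiftLeft(1).subtract(a));   // F(2k)
        BigInteger d = a.parallelMultiply(a).add(b.parallelMultiply(b)); // F(2k+1)
        return (n & 1) == 0
                ? new BigInteger[] {c, d}
                : new BigInteger[] {d, c.add(d)};
    }

    public static void main(String[] args) {
        long n = 100_000_000L;
        long start = System.nanoTime();
        BigInteger fn = fib(n)[0];
        System.out.printf("fib(%d) has %d bits, computed in %.1f s%n",
                n, fn.bitLength(), (System.nanoTime() - start) / 1e9);
    }
}
```

Replacing the parallelMultiply() calls with multiply() gives the sequential baseline.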
We have also parallelized the private square() method. Internally, square() still defaults to sequential execution.
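To make the "recursive, hence easy to parallelize" point concrete, here is a purely illustrative fork/join sketch. It uses the simpler Karatsuba split rather than Toom-Cook 3, assumes non-negative operands, and is not the JDK's internal code; it only shows the pattern of forking the recursive sub-multiplications:

```java
import java.math.BigInteger;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.ThreadLocalRandom;

// Illustration only: Karatsuba split with fork/join, non-negative operands.
final class ParallelKaratsuba extends RecursiveTask<BigInteger> {
    private static final int THRESHOLD_BITS = 1 << 13; // arbitrary cut-off for this sketch

    private final BigInteger a, b;

    ParallelKaratsuba(BigInteger a, BigInteger b) { this.a = a; this.b = b; }

    @Override
    protected BigInteger compute() {
        if (a.bitLength() < THRESHOLD_BITS || b.bitLength() < THRESHOLD_BITS) {
            return a.multiply(b);                       // small operands: stay sequential
        }
        int h = Math.max(a.bitLength(), b.bitLength()) / 2;
        BigInteger a1 = a.shiftRight(h), a0 = a.subtract(a1.shiftLeft(h));
        BigInteger b1 = b.shiftRight(h), b0 = b.subtract(b1.shiftLeft(h));

        // Three recursive products: fork two, compute the third in this thread.
        ParallelKaratsuba high = new ParallelKaratsuba(a1, b1);
        ParallelKaratsuba mid  = new ParallelKaratsuba(a1.add(a0), b1.add(b0));
        high.fork();
        mid.fork();
        BigInteger z0  = new ParallelKaratsuba(a0, b0).compute();
        BigInteger z1m = mid.join();                    // (a1+a0)*(b1+b0)
        BigInteger z2  = high.join();                   // a1*b1
        BigInteger z1  = z1m.subtract(z2).subtract(z0); // a1*b0 + a0*b1

        return z2.shiftLeft(2 * h).add(z1.shiftLeft(h)).add(z0);
    }

    public static void main(String[] args) {
        BigInteger x = new BigInteger(1 << 21, ThreadLocalRandom.current());
        BigInteger y = new BigInteger(1 << 21, ThreadLocalRandom.current());
        BigInteger product = ForkJoinPool.commonPool().invoke(new ParallelKaratsuba(x, y));
        System.out.println(product.equals(x.multiply(y))); // prints: true
    }
}
```

The measured JMH results for multiply() vs parallelMultiply():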
```
Benchmark                                                (n)  Mode  Cnt      Score      Error  Units
BigIntegerParallelMultiply.multiply              1000000    ss    4     68.043 ±   25.317  ms/op
BigIntegerParallelMultiply.multiply             10000000    ss    4   1073.095 ±  125.296  ms/op
BigIntegerParallelMultiply.multiply            100000000    ss    4  25317.535 ± 5806.205  ms/op
BigIntegerParallelMultiply.parallelMultiply      1000000    ss    4     56.552 ±   22.368  ms/op
BigIntegerParallelMultiply.parallelMultiply     10000000    ss    4    536.193 ±   37.393  ms/op
BigIntegerParallelMultiply.parallelMultiply    100000000    ss    4   9274.657 ±  826.197  ms/op
```
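The benchmark source is not shown above. Purely to illustrate the shape of such a run (Mode "ss" is single-shot time; Cnt 4 could come from, e.g., 2 forks × 2 iterations), here is a minimal JMH sketch comparing the two methods. The class and field names are made up for this sketch, and the real BigIntegerParallelMultiply micro evidently treats n as a Fibonacci index (the n = 100000000 scores match the 25.3 s / 9.2 s timings above), whereas this sketch simply multiplies two random n-bit operands, so its absolute numbers will not match the table:

```java
import java.math.BigInteger;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.SingleShotTime)   // the "ss" mode in the table above
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(2)
@Measurement(iterations = 2)          // 2 forks x 2 iterations -> Cnt = 4
public class BigIntegerMultiplyBench {

    @Param({"1000000", "10000000", "100000000"})
    int n;                            // bit length of each operand (sketch only)

    BigInteger a, b;

    @Setup
    public void setup() {
        a = new BigInteger(n, ThreadLocalRandom.current());
        b = new BigInteger(n, ThreadLocalRandom.current());
    }

    @Benchmark
    public BigInteger multiply() {
        return a.multiply(b);
    }

    @Benchmark
    public BigInteger parallelMultiply() {
        return a.parallelMultiply(b);
    }
}
```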
[Updated] PR is here: https://github.com/openjdk/jdk/pull/6409
- CSR for: JDK-8278886 Add a parallel multiply method to BigInteger (Closed)
- Relates to: JDK-8286363 BigInteger.parallelMultiply missing @since 19 (Resolved)