-
Bug
-
Resolution: Fixed
-
P4
-
8, 11, 12, 17
-
b14
-
x86_64
-
windows_10
-
Verified
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8274015 | 17.0.2 | Joe Darcy | P4 | Resolved | Fixed | b01 |
JDK-8275116 | 11.0.15-oracle | Ravi Reddy | P4 | Resolved | Fixed | b01 |
JDK-8278493 | 11.0.15 | Goetz Lindenmaier | P4 | Resolved | Fixed | b01 |
DoubleStream.sum and related functions (Collectors.averagingDouble, Collectors.summingDouble and possibly others) all use Kahan summation to reduce numerical error. As I understand it, the implementations of these function use a double array where index 0 holds the high-order bits of the running, and index 1 holds the negation of the low-order bits. The documentation incorrectly states that index 1 holds the lower order bits (no negaton), and when combining two running sums incorrectly adds the negation of the low-order bits.
This problem appears in OpenJDK 8 and 11. I think that in https://hg.openjdk.java.net/jdk/jdk/file/8613f3fdbdae/src/java.base/share/classes/java/util/stream/DoublePipeline.java, line 432 should be changed to
Collectors.sumWithCompensation(ll, -rr[1]);
and in https://hg.openjdk.java.net/jdk/jdk/file/8613f3fdbdae/src/java.base/share/classes/java/util/stream/Collectors.java, line 729 should be changed to
return sumWithCompensation(a, -b[1]); },
and line 841 should be changed to
(a, b) -> { sumWithCompensation(a, b[0]); sumWithCompensation(a, -b[1]); a[2] += b[2]; a[3] += b[3]; return a; },
Alternatively, sumWithCompensation could be altered instead.
I've attached test code below, comparing the result of sum() using the current implementation, and the result using the flipped sign. The results of the sequential and niave sum are given for comparison, and the sum of the squared errors (relative to a base case using sequential Kahan summation) are printed. With the random inputs used, the sum of squared errors is consistently lower with the proposed sign flip.
---------- BEGIN SOURCE ----------
package test;
import java.util.Random;
import java.util.stream.DoubleStream;
public class TestSum {
public static void main(String [] args) {
double naive = 0;
double sequentialStream = 0;
double parallelStream = 0;
double mySequentialStream = 0;
double myParallelStream = 0;
for (int loop = 0; loop < 100; loop++) {
// sequence of random numbers of varying magnitudes, both positive and negative
double[] rand = new Random().doubles(1_000_000)
.map(Math::log)
.map(x -> (Double.doubleToLongBits(x) % 2 == 0) ? x : -x)
.toArray();
// base case: standard Kahan summation
double[] sum = new double[2];
for (int i=0; i < rand.length; i++) {
sumWithCompensation(sum, rand[i]);
}
// squared error of naive sum by reduction - should be large
naive += Math.pow(DoubleStream.of(rand).reduce((x, y) -> x+y).getAsDouble() - sum[0], 2);
// squared error of sequential sum - should be 0
sequentialStream += Math.pow(DoubleStream.of(rand).sum() - sum[0], 2);
// squared error of parallel sum
parallelStream += Math.pow(DoubleStream.of(rand).parallel().sum() - sum[0], 2);
// squared error of modified sequential sum - should be 0
mySequentialStream += Math.pow(computeFinalSum(DoubleStream.of(rand).collect(
() -> new double[3],
(ll, d) -> {
sumWithCompensation(ll, d);
ll[2] += d;
},
(ll, rr) -> {
sumWithCompensation(ll, rr[0]);
sumWithCompensation(ll, -rr[1]); // minus is added
ll[2] += rr[2];
})) - sum[0], 2);
// squared error of modified parallel sum - typically ~0.25-0.5 times squared error of parallel sum
myParallelStream += Math.pow(computeFinalSum(DoubleStream.of(rand).parallel().collect(
() -> new double[3],
(ll, d) -> {
sumWithCompensation(ll, d);
ll[2] += d;
},
(ll, rr) -> {
sumWithCompensation(ll, rr[0]);
sumWithCompensation(ll, -rr[1]); // minus is added
ll[2] += rr[2];
})) - sum[0], 2);
}
// print sum of squared errors
System.out.println(naive);
System.out.println(sequentialStream);
System.out.println(parallelStream);
System.out.println(mySequentialStream);
System.out.println(myParallelStream);
}
// from OpenJDK8 Collectors, unmodified
static double[] sumWithCompensation(double[] intermediateSum, double value) {
double tmp = value - intermediateSum[1];
double sum = intermediateSum[0];
double velvel = sum + tmp; // Little wolf of rounding error
intermediateSum[1] = (velvel - sum) - tmp;
intermediateSum[0] = velvel;
return intermediateSum;
}
// from OpenJDK8 Collectors, unmodified
static double computeFinalSum(double[] summands) {
double tmp = summands[0] + summands[1];
double simpleSum = summands[summands.length - 1];
if (Double.isNaN(tmp) && Double.isInfinite(simpleSum))
return simpleSum;
else
return tmp;
}
}
---------- END SOURCE ----------
FREQUENCY : always
- backported by
-
JDK-8274015 Bug in parallel Kahan summation implementation
- Resolved
-
JDK-8275116 Bug in parallel Kahan summation implementation
- Resolved
-
JDK-8278493 Bug in parallel Kahan summation implementation
- Resolved
- relates to
-
JDK-8006572 DoubleStream.sum() & DoubleSummaryStats implementations that reduce numerical errors
- Closed
-
JDK-8273514 java/util/DoubleStreamSums/CompensatedSums.java failure
- Resolved
-
JDK-8274517 java/util/DoubleStreamSums/CompensatedSums.java fails with expected [true] but found [false]
- Closed
- links to
-
Commit openjdk/jdk11u-dev/a481c9a6
-
Commit openjdk/jdk17u/793d08a7
-
Commit openjdk/jdk/dd871819
-
Review openjdk/jdk11u-dev/707
-
Review openjdk/jdk17u/81
-
Review openjdk/jdk/4674