-
Bug
-
Resolution: Unresolved
-
P4
-
8
-
None
The parallel mutable collecting of elements of a stream to a List:
s.parallel().collect(toList())
is not very efficient and is implemented as follows:
public static <T>
Collector<T, ?, List<T>> toList() {
return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
(left, right) -> { left.addAll(right); return left; },
CH_ID);
}
Specifically the merging:
(left, right) -> { left.addAll(right); return left; },
copies all elements from the right-hand list into the left-hand list.
Similarly for maps the merging is implemented as follows:
private static <K, V, M extends Map<K,V>>
BinaryOperator<M> mapMerger(BinaryOperator<V> mergeFunction) {
return (m1, m2) -> {
for (Map.Entry<K,V> e : m2.entrySet())
m1.merge(e.getKey(), e.getValue(), mergeFunction);
return m1;
};
}
Both these merging implementations are very inefficient and can reduce the amount of parallel speed up due to copying of memory and garbage collection.
Mechanisms need to be explored to reduce the cost of merging using structural sharing. The list/set/map collectors make no guarantees on the actual concrete implementation and mutability thus there is scope to use alternative implementations that may be built up via accumulation and merging and then transformed.
s.parallel().collect(toList())
is not very efficient and is implemented as follows:
public static <T>
Collector<T, ?, List<T>> toList() {
return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
(left, right) -> { left.addAll(right); return left; },
CH_ID);
}
Specifically the merging:
(left, right) -> { left.addAll(right); return left; },
copies all elements from the right-hand list into the left-hand list.
Similarly for maps the merging is implemented as follows:
private static <K, V, M extends Map<K,V>>
BinaryOperator<M> mapMerger(BinaryOperator<V> mergeFunction) {
return (m1, m2) -> {
for (Map.Entry<K,V> e : m2.entrySet())
m1.merge(e.getKey(), e.getValue(), mergeFunction);
return m1;
};
}
Both these merging implementations are very inefficient and can reduce the amount of parallel speed up due to copying of memory and garbage collection.
Mechanisms need to be explored to reduce the cost of merging using structural sharing. The list/set/map collectors make no guarantees on the actual concrete implementation and mutability thus there is scope to use alternative implementations that may be built up via accumulation and merging and then transformed.
- relates to
-
JDK-8072718 add Map.mergeAll method, default implementation, and overrides in concrete classes
-
- Open
-