-
Bug
-
Resolution: Won't Fix
-
P4
-
None
-
8, 11, 17, 21, 22
-
generic
-
generic
ADDITIONAL SYSTEM INFORMATION :
OS: Windows 10
Java: Corretto JDK 1.8.262
A DESCRIPTION OF THE PROBLEM :
The streams fed to `Stream.flatMap(Function)` are eagerly streamed when the iterator returned from `Stream.iterator()` is first queried for elements w/ `iterator.hasNext()`. This is particularly problematic when the "source" streams will traverse millions of objects; the iterator fills the heap w/ the internal SpinedBuffer used by the flattened stream, which leads to OoM issues.
It appears this was *almost* addressed in a backport I found here: https://bugs.openjdk.org/browse/JDK-8225328. If I'm reading it right, this was patched in 1.8.222; I can see the changes when I read the source code on my machine. This was supposed to address terminal operations "cancelling" the flattened stream early, but I don't think it covered the case where an `.iterator()` call was made on the stream.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run unit tests provided in source code attribute.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Tests ought to pass.
ACTUAL -
The streamTerminated test ought to pass, b/c the original largeElements stream shouldn't be used until the flatItr stream is advanced. It fails b/c hasNext() reads all elements.
Similarly, iteratorExhausted ought to pass b/c the new iterator has not yet been used; it should not have consumed any elements from largeElements. It fails b/c flatItr.hasNext() consumed all source elements into a SpinedBuffer.
---------- BEGIN SOURCE ----------
import org.junit.Assert;
import org.junit.Test;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;
public class TestCase {
@Test
public void streamTerminated () {
Stream<?> largeElements = Stream.of(1, 2, 3);
Collection<Stream<?>> itrList = Arrays.asList(largeElements);
Iterator<?> flatItr = itrList.stream()
.flatMap(s -> s)
.iterator();
Assert.assertTrue("Flattened doesn't hasNext()", flatItr.hasNext()); // Passes
largeElements.collect(Collectors.toList()); // Exception
}
@Test
public void iteratorExhausted () {
Iterator<?> largeElements = Arrays.asList(1, 2, 3).iterator();
Collection<Iterator<?>> itrList = Arrays.asList(largeElements);
Iterator<?> flatItr = itrList.stream()
.flatMap(itr -> StreamSupport
.stream(
Spliterators.spliteratorUnknownSize(
itr,
Spliterator.ORDERED),
false)
.sequential())
.iterator();
Assert.assertTrue(
"Original iterator exhausted early",
largeElements.hasNext()); // Passes
Assert.assertTrue("Flattened doesn't hasNext()", flatItr.hasNext()); // Passes
Assert.assertTrue(
"Original iterator exhausted after hasNext()",
largeElements.hasNext()); // Fails
}
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
The only workaround I have it to just *not* use Stream.iterator(). If I want an iterator of iterators (e.g. a "concatenated" iterator), I can create a special Iterator implementation that internally manages an iterator of iterators.
FREQUENCY : always
OS: Windows 10
Java: Corretto JDK 1.8.262
A DESCRIPTION OF THE PROBLEM :
The streams fed to `Stream.flatMap(Function)` are eagerly streamed when the iterator returned from `Stream.iterator()` is first queried for elements w/ `iterator.hasNext()`. This is particularly problematic when the "source" streams will traverse millions of objects; the iterator fills the heap w/ the internal SpinedBuffer used by the flattened stream, which leads to OoM issues.
It appears this was *almost* addressed in a backport I found here: https://bugs.openjdk.org/browse/JDK-8225328. If I'm reading it right, this was patched in 1.8.222; I can see the changes when I read the source code on my machine. This was supposed to address terminal operations "cancelling" the flattened stream early, but I don't think it covered the case where an `.iterator()` call was made on the stream.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run unit tests provided in source code attribute.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Tests ought to pass.
ACTUAL -
The streamTerminated test ought to pass, b/c the original largeElements stream shouldn't be used until the flatItr stream is advanced. It fails b/c hasNext() reads all elements.
Similarly, iteratorExhausted ought to pass b/c the new iterator has not yet been used; it should not have consumed any elements from largeElements. It fails b/c flatItr.hasNext() consumed all source elements into a SpinedBuffer.
---------- BEGIN SOURCE ----------
import org.junit.Assert;
import org.junit.Test;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;
public class TestCase {
@Test
public void streamTerminated () {
Stream<?> largeElements = Stream.of(1, 2, 3);
Collection<Stream<?>> itrList = Arrays.asList(largeElements);
Iterator<?> flatItr = itrList.stream()
.flatMap(s -> s)
.iterator();
Assert.assertTrue("Flattened doesn't hasNext()", flatItr.hasNext()); // Passes
largeElements.collect(Collectors.toList()); // Exception
}
@Test
public void iteratorExhausted () {
Iterator<?> largeElements = Arrays.asList(1, 2, 3).iterator();
Collection<Iterator<?>> itrList = Arrays.asList(largeElements);
Iterator<?> flatItr = itrList.stream()
.flatMap(itr -> StreamSupport
.stream(
Spliterators.spliteratorUnknownSize(
itr,
Spliterator.ORDERED),
false)
.sequential())
.iterator();
Assert.assertTrue(
"Original iterator exhausted early",
largeElements.hasNext()); // Passes
Assert.assertTrue("Flattened doesn't hasNext()", flatItr.hasNext()); // Passes
Assert.assertTrue(
"Original iterator exhausted after hasNext()",
largeElements.hasNext()); // Fails
}
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
The only workaround I have it to just *not* use Stream.iterator(). If I want an iterator of iterators (e.g. a "concatenated" iterator), I can create a special Iterator implementation that internally manages an iterator of iterators.
FREQUENCY : always
- relates to
-
JDK-8225328 Stream.flatMap() causes breaking of short-circuiting of terminal operations
-
- Resolved
-