The Javadoc for Stream.forEach says (emphasis mine):
The behavior of this operation is explicitly nondeterministic. For parallel stream pipelines, this operation does not guarantee to respect the encounter order of the stream, as doing so would sacrifice the benefit of parallelism. For any given element, the action may be performed at whatever time and in whatever thread the library chooses. If the action accesses shared state, it is responsible for providing the required synchronization.
The same text is present in the Java 9 Early Access Javadoc.
The first sentence ("explicitly nondeterministic") suggests (but doesn't explicitly say) that encounter order is not preserved by this method. But the next sentence, that explicitly says order is not preserved, is conditioned on "For parallel stream pipelines", and that condition would be unnecessary if the sentence applied regardless of parallelism. That leaves me unsure whether forEach preserves encounter order for sequential streams.
This answer points out a spot where the streams library implementation calls .sequential().forEach(downstream). That suggests forEach is intended to preserve order for sequential streams, but could also just be a bug in the library.
I've sidestepped this ambiguity in my own code by using forEachOrdered to be on the safe side, but today I discovered that NetBeans IDE's "use functional operations" editor hint will convert
for (Foo foo : collection) foo.bar();
into
collection.stream().forEach((foo) -> { foo.bar(); });
which introduces a bug if forEach does not preserve encounter order. Before I report a bug against NetBeans, I want to know what the library actually guarantees, backed up by a source.
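To make the concern concrete, here is a minimal sketch of the three forms side by side. The class and method names are hypothetical and strings stand in for Foo; the only library calls are stream(), forEach, and forEachOrdered. It shows what each form is specified to do, not what any particular JDK build happens to do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of NetBeans or the JDK.
public class LoopConversionSketch {

    // Original form: an enhanced for loop visits elements in the
    // collection's iteration order.
    static List<String> withForLoop(List<String> items) {
        List<String> visited = new ArrayList<>();
        for (String item : items) {
            visited.add(item);
        }
        return visited;
    }

    // Converted form: the specification of Stream.forEach makes no ordering
    // promise, even if a sequential stream over a List is observed in order.
    static List<String> withStreamForEach(List<String> items) {
        List<String> visited = new ArrayList<>();
        items.stream().forEach(item -> visited.add(item));
        return visited;
    }

    // Conservative form: forEachOrdered follows encounter order whenever the
    // stream has one, which a List-backed stream does.
    static List<String> withForEachOrdered(List<String> items) {
        List<String> visited = new ArrayList<>();
        items.stream().forEachOrdered(visited::add);
        return visited;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(withForLoop(items));
        System.out.println(withStreamForEach(items));
        System.out.println(withForEachOrdered(items));
    }
}
```

Only the first and third methods have a documented ordering guarantee. If the stream is not needed at all, Iterable.forEach (called directly on the collection) is another option, since its Javadoc says actions are performed in iteration order when that order is specified.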
I'm looking for an answer drawing from authoritative sources. That could be an explicit comment in the library implementation, discussion on the Java development mailing lists (Google didn't find anything for me but maybe I don't know the magic words), or a statement from the library designers (of which I know two, Brian Goetz and Stuart Marks, are active on Stack Overflow). (Please do not answer with "just use forEachOrdered instead" -- I already do, but I want to know if code that doesn't is wrong.)
Stream forEachOrdered(Consumer action) performs an action for each element of this stream, in the encounter order of the stream if the stream has a defined encounter order. It is a terminal operation, i.e., it may traverse the stream to produce a result or a side effect.
stream().filter().forEachOrdered(), all elements will be processed sequentially in order, whereas for list.
A parallel stream may process more than one element at a time. Thus map() would preserve the stream's encounter order but not necessarily the original List's order.
If our Stream is ordered, it doesn't matter whether our data is being processed sequentially or in parallel; the implementation will maintain the encounter order of the Stream.
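As a rough illustration of that statement, the sketch below runs the same ordered source through a parallel pipeline in three ways. The class and method names are hypothetical. forEachOrdered and collect(Collectors.toList()) respect the encounter order of an ordered stream, while forEach on the parallel stream delivers the same elements in an unspecified order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class for comparing terminal operations on a parallel stream.
public class ParallelOrderSketch {

    // Builds the ordered source list 1..count.
    static List<Integer> numbers(int count) {
        return IntStream.rangeClosed(1, count).boxed().collect(Collectors.toList());
    }

    // forEachOrdered: actions are performed in encounter order, even in parallel.
    static List<Integer> viaForEachOrdered(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().map(n -> n * 2).forEachOrdered(seen::add);
        return seen;
    }

    // forEach: same elements, but the order of the side effects is unspecified.
    // The synchronized list supplies the synchronization the Javadoc asks for.
    static List<Integer> viaForEach(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().map(n -> n * 2).forEach(seen::add);
        return seen;
    }

    // collect: the result list follows encounter order without shared state.
    static List<Integer> viaCollect(List<Integer> source) {
        return source.parallelStream().map(n -> n * 2).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = numbers(20);
        System.out.println("forEachOrdered: " + viaForEachOrdered(source));
        System.out.println("forEach:        " + viaForEach(source));
        System.out.println("collect:        " + viaCollect(source));
    }
}
```

The forEach line may or may not come out sorted on a given run; only the other two are something a caller can rely on.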
Specifications exist to describe the minimal guarantees a caller can depend on, not to describe what the implementation does. This gap is crucial, as it allows the implementation flexibility to evolve. (Specification is declarative; implementation is imperative.) Overspecification is just as bad as underspecification.
When a specification says "does not preserve property X", it does not mean that the property X may never be observed; it means the implementation is not obligated to preserve it. Your claimed implication that encounter order is never preserved is simply a wrong conclusion. (HashSet doesn't promise that iterating its elements preserves the order they were inserted, but that doesn't mean this can't accidentally happen -- you just can't count on it.)
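The HashSet analogy can be sketched in a few lines. This is only an illustration with hypothetical class and method names: LinkedHashSet documents insertion-order iteration, while HashSet documents no order at all, so whatever order the second line prints is an accident of the implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class contrasting a specified and an unspecified iteration order.
public class IterationOrderSketch {

    // Adds the elements to the given set, then returns them in the set's iteration order.
    static List<String> iterationOrder(Set<String> set, List<String> insertions) {
        set.addAll(insertions);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> insertions = Arrays.asList("pear", "apple", "fig");

        // Specified: LinkedHashSet iterates in insertion order.
        System.out.println(iterationOrder(new LinkedHashSet<>(), insertions));

        // Unspecified: HashSet may or may not match insertion order.
        System.out.println(iterationOrder(new HashSet<>(), insertions));
    }
}
```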
Similarly, your implication of "that suggests forEach is intended to preserve order for sequential streams" because you saw an implementation that does so in some case is equally incorrect.
In both cases, it seems like you're just uncomfortable with the fact that the specification gives forEach a great deal of freedom. Specifically, it has the freedom to not preserve encounter order for sequential streams, even though that's what the implementation currently does, and even though it's kind of hard to imagine an implementation going out of its way to process sequential sources out of order. But that's what the spec says, and that's what it was intended to say.
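One way to live comfortably with that freedom is to give forEach only actions whose result does not depend on order. The following is a minimal sketch under that assumption, with hypothetical class and method names; it sums values with a LongAdder, so any processing order and any thread assignment yields the same total.

```java
import java.util.List;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class: an action that is correct under any processing order.
public class OrderInsensitiveSketch {

    // Addition is commutative and LongAdder is thread-safe, so this does not
    // rely on anything forEach leaves unspecified.
    static long sum(List<Integer> values) {
        LongAdder total = new LongAdder();
        values.parallelStream().forEach(v -> total.add(v.longValue()));
        return total.sum();
    }

    public static void main(String[] args) {
        List<Integer> values = IntStream.rangeClosed(1, 100).boxed().collect(Collectors.toList());
        System.out.println(sum(values));
    }
}
```

When the action does depend on order (appending to a log, writing lines to a file), forEachOrdered is the operation whose contract covers it.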
That said, the wording of the comment about parallel streams is potentially confusing, because it is still possible to misinterpret it. The intent of calling out the parallel case explicitly here was pedagogical; the spec is still perfectly clear with that sentence removed entirely. However, to a reader who is unaware of parallelism, it would be almost impossible to not assume that forEach would preserve encounter order, so this sentence was added to help clarify the motivation. But, as you point out, the desire to treat the sequential case specially is still so powerful that it would be beneficial to clarify further.