The official Oracle documentation says:
Note that you may lose the benefits of parallelism if you use operations like forEachOrdered with parallel streams. Oracle - Parallelism
Why would anyone use forEachOrdered with a parallel stream if we are losing parallelism?
The Stream API makes it possible to execute a sequential stream in parallel without rewriting the code. The primary reason for using parallel streams is to improve performance while at the same time ensuring that the results obtained are the same, or at least compatible, regardless of the mode of execution.
A parallel stream splits its work across the available CPU cores (by default through the common fork/join pool) and processes the pieces concurrently. If there are more tasks than worker threads, the remaining tasks wait until a worker becomes free.
The difference between forEachOrdered() and forEach() is that forEachOrdered() always performs the given action in the encounter order of the stream's elements (when the stream has a defined encounter order), whereas forEach() makes no ordering guarantee and is explicitly non-deterministic on parallel streams.
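As a rough illustration of that difference, the sketch below runs both terminal operations on the same parallel stream and records the order in which the action sees each element. The class and method names (ForEachOrderDemo, viaForEachOrdered, viaForEach) are made up for this example, and the synchronized list is only there so that the forEach variant can safely add from several threads.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ForEachOrderDemo {

    // Records elements in the order forEachOrdered hands them to the action.
    static List<Integer> viaForEachOrdered(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEachOrdered(seen::add);
        return seen;
    }

    // Records elements in whatever order forEach hands them to the action.
    static List<Integer> viaForEach(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> source = IntStream.rangeClosed(1, 20)
                .boxed()
                .collect(Collectors.toList());

        // Always matches the source order.
        System.out.println("forEachOrdered: " + viaForEachOrdered(source));
        // Same elements, but the order may differ from run to run.
        System.out.println("forEach:        " + viaForEach(source));
    }
}
```

On a machine with a single available core the forEach line may well come out in order too; the point is only that nothing guarantees it.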
Depending on the situation, one does not lose all the benefits of parallelism by using forEachOrdered.
Assume that we have something like this:
stringList.parallelStream().map(String::toUpperCase)
.forEachOrdered(System.out::println);
In this case, we can guarantee that the forEachOrdered terminal operation will print the uppercased strings in encounter order, but we should not assume that the elements are passed to the map intermediate operation in that same order. The map operation is executed by multiple threads concurrently, so one may still benefit from parallelism; we are just not using its full potential, because the terminal action itself runs one element at a time. To conclude, we should use forEachOrdered when it matters that an action is performed in the encounter order of the stream.
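A minimal sketch of this behaviour, assuming a small list of strings: the map step records which thread handled each element, while forEachOrdered still delivers the results in encounter order. The names ParallelMapOrderedSink, mapThreads, and upperCaseInOrder are hypothetical and exist only for the demonstration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ParallelMapOrderedSink {

    // Thread names observed inside the map step (bookkeeping for the demo only).
    static final Set<String> mapThreads = ConcurrentHashMap.newKeySet();

    static List<String> upperCaseInOrder(List<String> input) {
        List<String> out = Collections.synchronizedList(new ArrayList<>());
        input.parallelStream()
             .map(s -> {
                 // May run on several worker threads, in no particular order.
                 mapThreads.add(Thread.currentThread().getName());
                 return s.toUpperCase();
             })
             // The action is applied one element at a time, in encounter order.
             .forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<String> stringList = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");
        System.out.println("result:      " + upperCaseInOrder(stringList));
        // How many threads show up depends on the machine and the input size.
        System.out.println("map threads: " + mapThreads);
    }
}
```

The set of thread names typically contains more than one entry on a multi-core machine, but that is not guaranteed for such a small input.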
Edit following your comment:
What happens when you skip the map operation? I am more interested in forEachOrdered right after parallelStream()
If you're referring to something like:
stringList.parallelStream().forEachOrdered(action);
There is no benefit in doing this, and I doubt that's what the designers had in mind when they created the method. In such a case, it would make more sense to do:
stringList.stream().forEach(action);
To extend on your question "Why would anyone use forEachOrdered with parallel stream if we are losing parallelism": say you wanted to perform an action on each element with respect to the stream's encounter order. In that case you need forEachOrdered, because the forEach terminal operation is non-deterministic when used on a parallel stream. That is why forEachOrdered exists alongside forEach: it matters mainly for parallel streams.
I don't really get the question here. Why? Because you simply have no alternative: you have so much data that parallel streams will help you (this still needs to be proven), yet you still need to preserve the order, thus forEachOrdered. Notice that the documentation says you may lose the benefits, not that you will for sure; you would have to measure and see.
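One rough way to "measure and see" is sketched below: the same pipeline is run sequentially and in parallel, both ending in forEachOrdered, with a deliberately CPU-heavy map step. Everything here (OrderedTimingSketch, slowSquare, run, the input size) is an assumption for illustration, and System.nanoTime timings are only indicative because of JIT warm-up and other noise; a proper harness such as JMH would give more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class OrderedTimingSketch {

    // Hypothetical CPU-bound work standing in for an expensive map step.
    static long slowSquare(int n) {
        long acc = 0;
        for (int i = 0; i < 20_000; i++) {
            acc += (long) n * n % (i + 1);
        }
        return acc;
    }

    // Runs map + forEachOrdered either sequentially or in parallel.
    static List<Long> run(List<Integer> input, boolean parallel) {
        List<Long> out = Collections.synchronizedList(new ArrayList<>());
        Stream<Integer> stream = parallel ? input.parallelStream() : input.stream();
        stream.map(OrderedTimingSketch::slowSquare).forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 5_000)
                .boxed()
                .collect(Collectors.toList());

        // Warm-up pass so that JIT compilation distorts the timings less.
        run(input, false);
        run(input, true);

        long t0 = System.nanoTime();
        List<Long> sequential = run(input, false);
        long t1 = System.nanoTime();
        List<Long> parallel = run(input, true);
        long t2 = System.nanoTime();

        System.out.println("same results:   " + sequential.equals(parallel));
        System.out.println("sequential ms:  " + (t1 - t0) / 1_000_000);
        System.out.println("parallel ms:    " + (t2 - t1) / 1_000_000);
    }
}
```

Whether the parallel run comes out ahead depends on how expensive the map step is relative to the ordered terminal action, and on the number of cores available.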