Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is an unordered stream faster than an ordered one?

I'm reading Java 8 book by Richard Warburton and came up with this:

Some operation are more expensive on ordered stream. This problem can be solved by eliminating ordering. To do so, call the stream's unordered method. [...]

I was quite perplexed by that. Suppose we have Stream<Integer> stream = Arrays.asList(1, 2, 3, 4).stream();

Since List<Integer> defines an encounter order for the stream (some) operation might be performed inefficiently. Why is that?

How does it affect the processing and what makes it slow? In order to make things faster, in that case, should we be calling this as

Stream<Integer> stream = Arrays.asList(1, 2, 3, 4).stream().unordered();

? Sounds strange, to say the least...

like image 390
user3663882 Avatar asked May 14 '16 10:05

user3663882


People also ask

Is parallel stream faster than stream?

The performance of both streams degrades fast when the number of values increases. However, the parallel stream performs worse than the sequential stream in all cases.

Does stream guarantee order?

If you have an ordered stream and perform operations which guarantee to maintain the order, it doesn't matter whether the stream is processed in parallel or sequential; the implementation will maintain the order. The ordered property is distinct from parallel vs sequential.

Does Java stream change order?

While most intermediate operations will maintain the order of the Stream, some will, by their nature, change it. unordered and empty are two more examples of intermediate operations that will ultimately change the ordering of a Stream.

Does Java stream distinct preserve order?

Java Stream distinct() MethodIf the stream is ordered, the encounter order is preserved. It means that the element occurring first will be present in the distinct elements stream. If the stream is unordered, then the resulting stream elements can be in any order.


1 Answers

This is explained in detail in the documentation: https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html

Ordering
Streams may or may not have a defined encounter order. Whether or not a stream has an encounter order depends on the source and the intermediate operations. Certain stream sources (such as List or arrays) are intrinsically ordered, whereas others (such as HashSet) are not. Some intermediate operations, such as sorted(), may impose an encounter order on an otherwise unordered stream, and others may render an ordered stream unordered, such as BaseStream.unordered(). Further, some terminal operations may ignore encounter order, such as forEach().

If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]. However, if the source has no defined encounter order, then any permutation of the values [2, 4, 6] would be a valid result. For sequential streams, the presence or absence of an encounter order does not affect performance, only determinism. If a stream is ordered, repeated execution of identical stream pipelines on an identical source will produce an identical result; if it is not ordered, repeated execution might produce different results.

For parallel streams, relaxing the ordering constraint can sometimes enable more efficient execution. Certain aggregate operations, such as filtering duplicates (distinct()) or grouped reductions (Collectors.groupingBy()) can be implemented more efficiently if ordering of elements is not relevant. Similarly, operations that are intrinsically tied to encounter order, such as limit(), may require buffering to ensure proper ordering, undermining the benefit of parallelism. In cases where the stream has an encounter order, but the user does not particularly care about that encounter order, explicitly de-ordering the stream with unordered() may improve parallel performance for some stateful or terminal operations. However, most stream pipelines, such as the "sum of weight of blocks" example above, still parallelize efficiently even under ordering constraints.

like image 84
krokodilko Avatar answered Oct 10 '22 03:10

krokodilko