
Is .collect guaranteed to be ordered on parallel streams?

Tags: java, list, java-8

I have a list of strings, List<String> toProcess. The results have to be in the same order as the original lines. I want to use the new parallel streams.

Does the following code guarantee that the results will be in the same order they were in the original list?

    // ["a", "b", "c"]
    List<String> toProcess;

    // should be ["a", "b", "c"]
    List<String> results = toProcess.parallelStream()
                                    .map(s -> s)
                                    .collect(Collectors.toList());
JFBM asked Apr 17 '15 23:04




1 Answer

TL;DR

Yes, the order is guaranteed.

Stream.collect() API documentation

The starting place is to look at what determines whether a reduction is concurrent or not. Stream.collect()'s description says the following:

If the stream is parallel, and the Collector is concurrent, and either the stream is unordered or the collector is unordered, then a concurrent reduction will be performed (see Collector for details on concurrent reduction.)

The first condition is satisfied: the stream is parallel. How about the second and third: is the Collector concurrent and unordered?
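The three conditions quoted above can be written down as a small check. This is only a sketch: wouldReduceConcurrently is a hypothetical helper of my own, not part of the JDK, and whether the stream is parallel and ordered is passed in as plain booleans. toConcurrentMap is included purely as a contrast case, since it does declare both traits.

```java
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ConcurrentReductionCheck {

    // Hypothetical helper (not a JDK method) mirroring the rule quoted from
    // the Stream.collect() documentation.
    static boolean wouldReduceConcurrently(boolean parallel,
                                           boolean streamOrdered,
                                           Collector<?, ?, ?> collector) {
        boolean concurrent = collector.characteristics()
                .contains(Collector.Characteristics.CONCURRENT);
        boolean collectorUnordered = collector.characteristics()
                .contains(Collector.Characteristics.UNORDERED);
        return parallel && concurrent && (!streamOrdered || collectorUnordered);
    }

    public static void main(String[] args) {
        // A parallel stream over a List is ordered, and toList() is not CONCURRENT.
        System.out.println(
                wouldReduceConcurrently(true, true, Collectors.toList())); // prints false

        // Contrast: toConcurrentMap declares both CONCURRENT and UNORDERED.
        Collector<String, ?, ConcurrentMap<String, String>> concurrentCollector =
                Collectors.toConcurrentMap(s -> s, s -> s);
        System.out.println(
                wouldReduceConcurrently(true, true, concurrentCollector)); // prints true
    }
}
```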
 

Collectors.toList() API documentation

toList()'s documentation reads:

Returns a Collector that accumulates the input elements into a new List. There are no guarantees on the type, mutability, serializability, or thread-safety of the List returned; if more control over the returned List is required, use toCollection(Supplier).

Returns:
a Collector which collects all the input elements into a List, in encounter order

An operation that works in encounter order operates on the elements in their original order. This holds whether the stream is sequential or parallel.
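To see this in practice, here is a minimal sketch that compares a parallel run against a sequential one on a larger list. The class name, the process method, and the "line-" input data are made up for illustration. A passing run does not prove the guarantee (that comes from the documentation), but it shows what the guarantee looks like.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class EncounterOrderDemo {

    // Same shape as the code in the question, with a non-trivial map step.
    static List<String> process(List<String> toProcess) {
        return toProcess.parallelStream()
                        .map(String::toUpperCase)
                        .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Illustrative input: 100,000 distinct strings in a known order.
        List<String> toProcess = IntStream.range(0, 100_000)
                                          .mapToObj(i -> "line-" + i)
                                          .collect(Collectors.toList());

        List<String> sequential = toProcess.stream()
                                           .map(String::toUpperCase)
                                           .collect(Collectors.toList());

        // The parallel result matches the sequential one element for element.
        System.out.println(process(toProcess).equals(sequential)); // prints true
    }
}
```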
 

Implementation code

Inspecting the implementation of Collectors.java confirms that toList() does not include the CONCURRENT or UNORDERED traits.

    public static <T> Collector<T, ?, List<T>> toList() {
        return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
                                   (left, right) -> { left.addAll(right); return left; },
                                   CH_ID);
    }

    // ...

    static final Set<Collector.Characteristics> CH_ID
            = Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.IDENTITY_FINISH));

Notice that the collector is built with the CH_ID characteristics set, which contains only IDENTITY_FINISH. CONCURRENT and UNORDERED are not there, so the reduction cannot be concurrent.

A non-concurrent reduction means that, if the stream is parallel, collection can still proceed in parallel, but the work is split into several thread-confined intermediate results. Those partial results are then merged by the combiner in encounter order, which ensures the combined result is in encounter order.
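The thread-confined intermediate results can be made visible with a stand-in collector. This is a sketch under assumptions: countingToList is a hypothetical collector of my own, built with Collector.of from the same three functions as toList(), plus a counter in the supplier. How many intermediate lists get created depends on the machine and the JDK, so no particular count is claimed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ThreadConfinedCollectDemo {

    // Counts how many intermediate lists the supplier has created.
    static final AtomicInteger containers = new AtomicInteger();

    // Hypothetical stand-in for toList(): same supplier/accumulator/combiner
    // shape, IDENTITY_FINISH only, so neither CONCURRENT nor UNORDERED.
    static <T> Collector<T, List<T>, List<T>> countingToList() {
        return Collector.of(
                () -> { containers.incrementAndGet(); return new ArrayList<T>(); },
                List::add,
                (left, right) -> { left.addAll(right); return left; },
                Collector.Characteristics.IDENTITY_FINISH);
    }

    static <T> List<T> collectInParallel(List<T> input) {
        return input.parallelStream().collect(countingToList());
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.range(0, 10_000)
                                       .boxed()
                                       .collect(Collectors.toList());

        List<Integer> result = collectInParallel(input);

        // Partial lists were merged left-to-right, so the order is unchanged.
        System.out.println(result.equals(input)); // prints true

        // Number of intermediate lists; varies by machine and JDK.
        System.out.println("intermediate lists: " + containers.get());
    }
}
```

Each intermediate list is only ever touched by one thread at a time, which is why a plain ArrayList is safe here even though the stream is parallel.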
 

See also: Why parallel stream get collected sequentially in Java 8

John Kugelman answered Sep 21 '22 12:09