
The Difference between Parallel and Sequential Stream in terms of Java 1.8

What is the functional difference between a sequential and a parallel stream in Java 1.8, and how is the output affected?

In which scenarios should I choose a parallel stream, and in which a sequential one?

How does the processing differ between a sequential and a parallel stream in Java?

I tried the snippet below to test it with a small amount of data, and I didn't see any notable difference in the output.

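// Fill a list with the numbers 1..100
ArrayList<Integer> arrayList = new ArrayList<>();
for (int i = 1; i <= 100; i++) arrayList.add(i);

// Sequential stream: print the elements greater than 90
arrayList.stream().filter(l -> l > 90).forEach(l -> System.out.println(l));

// Parallel stream: same pipeline, possibly run on several threads
arrayList.parallelStream().filter(l -> l > 90).forEach(l -> System.out.println(l));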

Hardik Patel asked Dec 24 '17 13:12



4 Answers

For your concrete example, you got lucky not to see any difference (extend the loop to 101 so that the elements are distributed a bit less evenly among the threads, and you will see the difference). forEach is documented as:

The behavior of this operation is explicitly nondeterministic

So for parallel processing at least, there is no order, or at least none that you can rely on. If you need the order, forEachOrdered guarantees it.
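
To make the forEach versus forEachOrdered point concrete, here is a minimal sketch. The class and method names (ForEachOrderDemo, withForEachOrdered, withForEach) are made up for this illustration; only the stream calls come from the JDK.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ForEachOrderDemo {

    // Records the elements greater than 90 in the order forEachOrdered delivers them.
    static List<Integer> withForEachOrdered(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream()
              .filter(n -> n > 90)
              .forEachOrdered(seen::add);   // encounter order is guaranteed
        return seen;
    }

    // Same pipeline with forEach: every element arrives, but in no guaranteed order.
    static List<Integer> withForEach(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream()
              .filter(n -> n > 90)
              .forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 100)
                                         .boxed()
                                         .collect(Collectors.toList());
        System.out.println("forEachOrdered: " + withForEachOrdered(numbers));
        System.out.println("forEach:        " + withForEach(numbers));
    }
}
```

The forEachOrdered line should always read 91 to 100 in ascending order. The forEach line contains the same ten numbers, but their order may change from run to run, and it may also happen to come out sorted.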

Choosing between parallel and sequential is not easy: you should measure. Brian's advice is the best thing to read here.

Eugene answered Oct 17 '22 22:10


Since you are creating a parallel stream, it is possible for the elements of the stream to be processed by different threads. A parallel stream allows multiple threads to work on sections of a stream independently. The code where you are using parallelStream() illustrates how you can take advantage of multiple cores.
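
One way to see this for yourself is to record which threads touch the elements. The sketch below is only an illustration: StreamThreadsDemo and threadNames are hypothetical names, and the number of worker threads you observe depends on your machine.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class StreamThreadsDemo {

    // Returns the names of the threads that processed the elements 1..count.
    static Set<String> threadNames(boolean parallel, int count) {
        Set<String> names = ConcurrentHashMap.newKeySet();   // thread-safe set
        IntStream range = IntStream.rangeClosed(1, count);
        if (parallel) {
            range = range.parallel();
        }
        range.forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("sequential: " + threadNames(false, 10_000));
        System.out.println("parallel:   " + threadNames(true, 10_000));
    }
}
```

The sequential run reports only the calling thread. On a multi-core machine the parallel run typically reports the calling thread plus several workers from the common fork/join pool, although on a single-core machine it may still report just one thread.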

You will not see a big difference when using parallelStream() on 100 elements. You need more than that.

As for ordering, there are two terminal operations to choose from: forEach and forEachOrdered. The difference between them is that forEach allows the elements of a parallel stream to be processed in any order, while forEachOrdered always processes the elements of a parallel stream in the order of their appearance in the original stream. Therefore, in this case, if you leave forEach as it is, there is no guarantee regarding order.

Alex Mamo answered Oct 17 '22 23:10


Generally, a parallel stream is a stream that partitions its elements into multiple chunks and processes each chunk on a different thread. You can therefore spread the workload of a given operation across the cores of your multicore processor and keep them busy.

However, it's important to note that invoking parallelStream() doesn't necessarily make the stream parallel. In fact, this method may return a sequential stream rather than a parallel one.

As stated in the Javadoc:

default Stream<E> parallelStream()

Returns a possibly parallel Stream with this collection as its source. It is allowable for this method to return a sequential stream.

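Therefore we can conclude that it is up to the library to determine whether it is appropriate to use multiple threads. Note that standard collections such as ArrayList do return a parallel stream regardless of their size. The benefit just tends to show only when there is a large amount of data to process.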

In your case there are only 100 elements in the ArrayList, so it makes little observable difference whether you use parallelStream() or not.
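
If you want to check what you actually got back, isParallel() reports it. A small sketch (IsParallelDemo and oneToHundred are names invented for this example):

```java
import java.util.ArrayList;
import java.util.List;

class IsParallelDemo {

    // Builds the same 1..100 list as in the question.
    static List<Integer> oneToHundred() {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = oneToHundred();

        // stream() is sequential
        System.out.println(list.stream().isParallel());

        // ArrayList uses the default Collection.parallelStream(), which is parallel
        System.out.println(list.parallelStream().isParallel());

        // sequential() switches the whole pipeline back
        System.out.println(list.parallelStream().sequential().isParallel());
    }
}
```

With an ArrayList this prints false, true, false. A custom Collection implementation is allowed to answer differently for parallelStream(), which is what the Javadoc wording leaves room for.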

Last but not least, I would always use a sequential stream by default, and switch to parallelStream() only when there is a huge amount of data to process or when you are experiencing performance problems with a sequential stream.
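
If you do want to compare the two on your own workload, a rough timing loop like the one below can give a first impression. Treat it as a sketch only: RoughTimingDemo, countPrimes and the prime-counting workload are made up for illustration, and a single System.nanoTime() measurement ignores JIT warm-up, so a benchmark harness such as JMH is the better tool for real decisions.

```java
import java.util.stream.LongStream;

class RoughTimingDemo {

    // Counts the primes in 2..limit, either sequentially or in parallel.
    static long countPrimes(long limit, boolean parallel) {
        LongStream candidates = LongStream.rangeClosed(2, limit);
        if (parallel) {
            candidates = candidates.parallel();
        }
        return candidates.filter(RoughTimingDemo::isPrime).count();
    }

    // Naive trial division: deliberately CPU-heavy per element.
    static boolean isPrime(long n) {
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long limit = 2_000_000;
        for (boolean parallel : new boolean[] {false, true}) {
            long start = System.nanoTime();
            long primes = countPrimes(limit, parallel);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println((parallel ? "parallel" : "sequential")
                    + ": " + primes + " primes in " + millis + " ms");
        }
    }
}
```

Both runs must report the same count. Only the elapsed time should differ, and by how much depends on the number of cores and on how expensive the per-element work is.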

Ousmane D. answered Oct 17 '22 23:10


The docs of Stream state that parallel is a property of the stream, but don't say much about how it is implemented.

The difference lies in how the declarative operations on the stream are executed. In most cases it does not show up in the result. It shows only where the result depends on how the elements are processed.

The best explanation of the difference can probably be found in the forEach terminal stream method that you're calling. The docs for Stream.forEach stipulate:

The behavior of this operation is explicitly nondeterministic. For parallel stream pipelines, this operation does not guarantee to respect the encounter order of the stream, as doing so would sacrifice the benefit of parallelism. For any given element, the action may be performed at whatever time and in whatever thread the library chooses. If the action accesses shared state, it is responsible for providing the required synchronization.
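
The last sentence of that quote is the one that bites in practice. As a sketch (SharedStateDemo and both method names are hypothetical), here are two ways to gather results from a parallel stream without corrupting shared state:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SharedStateDemo {

    // forEach writing into shared state: the list must be thread-safe.
    // A plain ArrayList here could lose elements or throw under parallel execution.
    static List<Integer> viaSynchronizedList(int count) {
        List<Integer> target = Collections.synchronizedList(new ArrayList<>());
        IntStream.rangeClosed(1, count)
                 .parallel()
                 .boxed()
                 .forEach(target::add);
        return target;   // complete, but in arbitrary order
    }

    // Usually preferable: let collect() do the merging, with no shared state at all.
    static List<Integer> viaCollect(int count) {
        return IntStream.rangeClosed(1, count)
                        .parallel()
                        .boxed()
                        .collect(Collectors.toList());   // encounter order is preserved
    }

    public static void main(String[] args) {
        System.out.println(viaSynchronizedList(20));
        System.out.println(viaCollect(20));
    }
}
```

The first method returns all the elements in whatever order the threads happened to add them. The second returns them in encounter order even though the stream is parallel, because collect combines the partial results in order.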

In other words, a sequential stream keeps the order at the expense of concurrency. That is just one difference among others.

ernest_k answered Oct 18 '22 00:10