
Are Java streams stages sequential?

I have a question about how the intermediate stages of a stream are sequenced: is each stage's operation applied to all of the stream's items before the next stage starts, or are all of the stages applied to each item in turn?

I'm aware the question might not be easy to understand, so I'll give an example. Given the following stream pipeline:

import java.util.Arrays;
import java.util.List;

List<String> strings = Arrays.asList("Are Java streams intermediate stages sequential?".split(" "));
strings.stream()
       .filter(word -> word.length() > 4)                    // keep words longer than 4 characters
       .peek(word -> System.out.println("f: " + word))       // trace what passed the filter
       .map(word -> word.length())                           // word -> its length
       .peek(length -> System.out.println("m: " + length))   // trace the mapped value
       .forEach(length -> System.out.println("-> " + length + "\n"));

My expectation for this code is that it will output:

f: streams
f: intermediate
f: stages
f: sequential?

m: 7
m: 12
m: 6
m: 11

-> 7
-> 12
-> 6
-> 11

Instead, the output is:

f: streams
m: 7
-> 7

f: intermediate
m: 12
-> 12

f: stages
m: 6
-> 6

f: sequential?
m: 11
-> 11

Is this interleaving just an artifact of the console output, or is each item actually processed through all of the stages, one item at a time?

I can further detail the question, if it's not clear enough.

Bogdan Solga, asked Jul 01 '17 15:07




2 Answers

This behaviour is what makes optimisations such as short-circuiting possible. If each intermediate operation had to process every element of the stream before the next intermediate operation could start, a pipeline could never stop early.

So to answer your question, each element moves along the stream pipeline vertically one at a time (except for some stateful operations discussed later), therefore enabling optimisation where possible.

Explanation

Given the example you've provided, each element will move along the stream pipeline vertically one by one as there is no stateful operation included.

As another example, suppose you are looking for the first String whose length is greater than 4: processing all of the elements before producing the result would be unnecessary and time-consuming.

Consider this simple illustration:

List<String> stringsList = Arrays.asList("1", "12", "123", "1234", "12345", "123456", "1234567");
int result = stringsList.stream()
                        .filter(s -> s.length() > 4)
                        .mapToInt(Integer::parseInt)
                        .findFirst()
                        .orElse(0);

The filter intermediate operation above does not first find all the elements whose length is greater than 4 and return a new stream of them. Instead, as soon as the first element whose length is greater than 4 is found, that element passes through to mapToInt; findFirst then reports that it has found the first element, and execution stops there. The result is therefore 12345.
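To check this rather than take it on faith, here is a minimal sketch that records which elements are actually pulled from the source. The class name LazyFindFirstSketch, the method firstLongerThanFour and the visited list are made up for illustration; the extra peek is only there for tracing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the original answer.
class LazyFindFirstSketch {

    // Returns the first element longer than 4 characters as an int (0 if none),
    // and records in 'visited' every element that was pulled from the source.
    static int firstLongerThanFour(List<String> source, List<String> visited) {
        return source.stream()
                .peek(visited::add)              // tracing only
                .filter(s -> s.length() > 4)
                .mapToInt(Integer::parseInt)
                .findFirst()
                .orElse(0);
    }

    public static void main(String[] args) {
        List<String> visited = new ArrayList<>();
        List<String> source = Arrays.asList("1", "12", "123", "1234", "12345", "123456", "1234567");
        int result = firstLongerThanFour(source, visited);
        System.out.println(result);   // 12345
        System.out.println(visited);  // [1, 12, 123, 1234, 12345]
    }
}
```

On a sequential stream, "123456" and "1234567" never show up in visited, which is the early exit described above.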

Behaviour of stateful and stateless intermediate operations

Note that when a stateful intermediate operation such as sorted is included in a stream pipeline, that specific operation will traverse the entire stream: it has to consume every upstream element before it can pass any of them downstream. This makes sense, because in order to sort the elements you need to see all of them to determine which come first in the sort order.
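A small sketch of sorted acting as a barrier, assuming a sequential stream. The class SortedBarrierSketch and the log list are invented for this illustration; the two peek calls just record when an element passes each side of sorted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original answer.
class SortedBarrierSketch {

    // Records the order in which elements pass before and after sorted().
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream.of("c", "a", "b")
                .peek(s -> log.add("before sorted: " + s))
                .sorted()
                .peek(s -> log.add("after sorted: " + s))
                .forEach(s -> { /* terminal operation, only drives the pipeline */ });
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
        // before sorted: c
        // before sorted: a
        // before sorted: b
        // after sorted: a
        // after sorted: b
        // after sorted: c
    }
}
```

All of the "before" lines appear before the first "after" line, unlike the interleaved output in the question, where no stateful operation is involved.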

The distinct intermediate operation is also stateful. However, as @Holger has mentioned, unlike sorted it does not need to traverse the entire stream first: each distinct element can be passed down the pipeline immediately and may fulfil a short-circuiting condition.
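To illustrate the difference from sorted, here is a sketch that combines distinct with limit(2) on a sequential stream. The names DistinctShortCircuitSketch, firstTwoDistinct and visited are hypothetical, and limit is used here only as one convenient short-circuiting condition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original answer.
class DistinctShortCircuitSketch {

    // Collects the first two distinct elements and records in 'visited'
    // every element that was pulled from the source to get there.
    static List<String> firstTwoDistinct(List<String> source, List<String> visited) {
        return source.stream()
                .peek(visited::add)   // tracing only
                .distinct()           // stateful, but forwards each new element immediately
                .limit(2)             // short-circuits once two elements have arrived
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> visited = new ArrayList<>();
        List<String> result = firstTwoDistinct(Arrays.asList("a", "a", "b", "c", "c"), visited);
        System.out.println(result);   // [a, b]
        System.out.println(visited);  // [a, a, b]
    }
}
```

The trailing "c" elements are never pulled from the source, so distinct did not have to see the whole input before letting elements through.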

Stateless intermediate operations such as filter and map do not have to traverse the entire stream and can freely process one element at a time, vertically, as mentioned above.

Last but not least, note that when the terminal operation is a short-circuiting one (for example findFirst or anyMatch), it can finish before traversing all the elements of the underlying stream.
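One way to see a short-circuiting terminal operation at work is to run it against an infinite source, where traversing everything is not even possible. This is only a sketch: ShortCircuitTerminalSketch, anyMultipleOfFive and visited are names made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original answer.
class ShortCircuitTerminalSketch {

    // Runs anyMatch over the infinite sequence 1, 2, 3, ... and records
    // in 'visited' every element that was generated before it returned.
    static boolean anyMultipleOfFive(List<Integer> visited) {
        return Stream.iterate(1, n -> n + 1)   // infinite sequential source
                .peek(visited::add)            // tracing only
                .anyMatch(n -> n % 5 == 0);    // short-circuiting terminal operation
    }

    public static void main(String[] args) {
        List<Integer> visited = new ArrayList<>();
        System.out.println(anyMultipleOfFive(visited)); // true
        System.out.println(visited);                    // [1, 2, 3, 4, 5]
    }
}
```

The call returns as soon as 5 matches, so the pipeline terminates even though the source never ends.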

Further reading: Java 8 stream tutorial

Ousmane D., answered Sep 23 '22 11:09


The answer is loop fusion. The four intermediate operations filter(), peek(), map() and peek(), together with the println call in the terminal operation forEach(), have been logically joined together into a single pass. They are executed in order for each individual element. Joining operations into a single pass like this is an optimization technique known as loop fusion.
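As a rough mental model (not a description of how the stream library is implemented internally), the fused pipeline from the question behaves like one hand-written loop. In the sketch below, LoopFusionSketch, viaStream and viaLoop are hypothetical names, and the output lines are collected into a list instead of being printed so the two versions can be compared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the original answer.
class LoopFusionSketch {

    // The pipeline from the question, with println replaced by out.add.
    static List<String> viaStream(List<String> words) {
        List<String> out = new ArrayList<>();
        words.stream()
                .filter(w -> w.length() > 4)
                .peek(w -> out.add("f: " + w))
                .map(String::length)
                .peek(len -> out.add("m: " + len))
                .forEach(len -> out.add("-> " + len));
        return out;
    }

    // A single loop that applies every stage to one element before moving on.
    static List<String> viaLoop(List<String> words) {
        List<String> out = new ArrayList<>();
        for (String w : words) {
            if (w.length() <= 4) {
                continue;               // filter
            }
            out.add("f: " + w);         // first peek
            int len = w.length();       // map
            out.add("m: " + len);       // second peek
            out.add("-> " + len);       // forEach
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Are Java streams intermediate stages sequential?".split(" "));
        System.out.println(viaStream(words).equals(viaLoop(words))); // true
    }
}
```

For a sequential stream both methods produce the same lines in the same order, starting with "f: streams", "m: 7", "-> 7".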

Further reading: Source

snr, answered Sep 23 '22 11:09