 

Intermediate stream operations not evaluated on count

It seems I'm having trouble understanding how Java composes stream operations into a stream pipeline.

When executing the following code

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();

        var count = Stream.of(new String[]{"1", "2", "3", "4"})
                .map(sb::append)
                .count();

        System.out.println(count);
        System.out.println(sb.toString());
    }

The console only prints 4. The StringBuilder object still has the value "".

When I add the filter operation: filter(s -> true)

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();

        var count = Stream.of(new String[]{"1", "2", "3", "4"})
                .filter(s -> true)
                .map(sb::append)
                .count();

        System.out.println(count);
        System.out.println(sb.toString());
    }

The output changes to:

    4
    1234

How does this seemingly redundant filter operation change the behavior of the composed stream pipeline?

atalantus asked Jan 04 '20 15:01




2 Answers

The count() terminal operation, in my version of the JDK, ends up executing the following code:

    if (StreamOpFlag.SIZED.isKnown(helper.getStreamAndOpFlags()))
        return spliterator.getExactSizeIfKnown();
    return super.evaluateSequential(helper, spliterator);

If there is a filter() operation in the pipeline of operations, the size of the stream, which is known initially, can't be known anymore (since filter could reject some elements of the stream). So the if block is not executed, the intermediate operations are executed and the StringBuilder is thus modified.

On the other hand, if you only have map() in the pipeline, the number of elements in the stream is guaranteed to be the same as the initial number of elements. So the if block is executed, and the size is returned directly without evaluating the intermediate operations.
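To see the size information that this check relies on, you can ask each pipeline's spliterator for its exact size. This is only a rough sketch: the class and method names are made up for illustration, and it inspects the public Spliterator API rather than the internal StreamOpFlag used by count(). On the JDK versions I know of, the map-only pipeline reports 4, and the pipeline with filter reports -1 (unknown).

```java
import java.util.Spliterator;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the JDK.
final class SizedFlagDemo {

    // Returns the exact size reported by the pipeline's spliterator,
    // or -1 if the size is not known without traversing the elements.
    static long exactSize(Stream<?> stream) {
        Spliterator<?> spliterator = stream.spliterator();
        return spliterator.getExactSizeIfKnown();
    }

    public static void main(String[] args) {
        // map() keeps a one-to-one relation to the source, so the size stays known.
        long mapOnly = exactSize(
                Stream.of("1", "2", "3", "4").map(String::length));

        // filter() may drop elements, so the size is no longer known up front,
        // even when the predicate happens to accept everything.
        long withFilter = exactSize(
                Stream.of("1", "2", "3", "4").filter(s -> true).map(String::length));

        System.out.println("map only:     " + mapOnly);
        System.out.println("filter + map: " + withFilter);
    }
}
```

The predicate s -> true never rejects anything, but the stream implementation does not inspect what a predicate does. The mere presence of filter() is enough to clear the known size.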

Note that the lambda passed to map() violates the contract defined in the documentation: it's supposed to be a non-interfering, stateless operation, but it is not stateless. So getting different results in the two cases can't be considered a bug.
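If the goal is simply to get both the count and the concatenated string, one way to stay within the contract is to avoid mutating shared state from map() altogether. The following is a minimal sketch, assuming the elements live in a List; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the JDK.
final class StatelessJoinDemo {

    // Concatenates the elements with a collector instead of appending
    // to an external StringBuilder from inside map().
    static String concat(List<String> items) {
        return items.stream().collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("1", "2", "3", "4");

        System.out.println(items.size());   // 4
        System.out.println(concat(items));  // 1234
    }
}
```

Because the result no longer depends on a side effect, it stays the same whether or not the stream implementation skips the pipeline.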

JB Nizet answered Oct 02 '22 11:10



As of JDK 9, this behavior is explicitly documented in the Javadoc:

The eliding of side-effects may also be surprising. With the exception of terminal operations forEach and forEachOrdered, side-effects of behavioral parameters may not always be executed when the stream implementation can optimize away the execution of behavioral parameters without affecting the result of the computation. (For a specific example see the API note documented on the count operation.)

API Note:

An implementation may choose to not execute the stream pipeline (either sequentially or in parallel) if it is capable of computing the count directly from the stream source. In such cases no source elements will be traversed and no intermediate operations will be evaluated. Behavioral parameters with side-effects, which are strongly discouraged except for harmless cases such as debugging, may be affected. For example, consider the following stream:

    List<String> l = Arrays.asList("A", "B", "C", "D");
    long count = l.stream().peek(System.out::println).count();

The number of elements covered by the stream source, a List, is known and the intermediate operation, peek, does not inject into or remove elements from the stream (as may be the case for flatMap or filter operations). Thus the count is the size of the List and there is no need to execute the pipeline and, as a side-effect, print out the list elements.
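When the side effect itself is what you want, the quoted documentation names forEach and forEachOrdered as the terminal operations whose side effects are not elided. Here is a small sketch contrasting that with the peek example above; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class GuaranteedSideEffectDemo {

    // Appends every element through forEachOrdered, a terminal operation
    // whose side effects are executed for each element in encounter order.
    static String appendAll(List<String> items) {
        StringBuilder sb = new StringBuilder();
        items.stream().forEachOrdered(sb::append);
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> l = Arrays.asList("A", "B", "C", "D");

        // The count is always 4, but whether peek prints anything
        // depends on whether the implementation skips the pipeline.
        long count = l.stream().peek(System.out::println).count();
        System.out.println(count);

        // The append side effect is reliable here.
        System.out.println(appendAll(l));
    }
}
```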

Deadpool answered Oct 02 '22 13:10
