
Java stream operation order of execution at terminal point [duplicate]

I have been trying to find a clear contract in the official Java documentation that states in which order a Java stream, once a terminal operation is called, processes its elements and calls the intermediate operations.

For example, consider the following snippets, each of which pairs a stream version with plain-iteration versions that produce the same outcome.

Example 1:

    List<Integer> ints = Arrays.asList(1, 2, 3, 4, 5);
    Function<Integer, Integer> map1 = i -> i;
    Predicate<Integer> f1 = i -> i > 2;

    public int findFirstUsingStreams(List<Integer> ints){
        return ints.stream().map(map1).filter(f1).findFirst().orElse(-1);
    }

    public int findFirstUsingLoopV1(List<Integer> ints){
        for (int i : ints){
            int mappedI = map1.apply(i);
            if ( f1.test(mappedI) ) return mappedI;
        }
        return -1;
    }

    public int findFirstUsingLoopV2(List<Integer> ints){
        List<Integer> mappedInts = new ArrayList<>( ints.size() );

        for (int i : ints){
            int mappedI = map1.apply(i);
            mappedInts.add(mappedI);
        }

        for (int mappedI : mappedInts){
            if ( f1.test(mappedI) ) return mappedI;
        }
        return -1;
    }

Once findFirst is called, will the stream in findFirstUsingStreams run map1 in the order shown in findFirstUsingLoopV1 (map1 is not run for all elements) or as shown in findFirstUsingLoopV2 (map1 is run for all elements)?

And could that order change in future versions of Java, or is there official documentation that guarantees the order of map1 calls?

Example 2:

    Predicate<Integer> f1 = i -> i > 2;
    Predicate<Integer> f2 = i -> i > 3;

    public List<Integer> collectUsingStreams(List<Integer> ints){
        return ints.stream().filter(f1).filter(f2).collect( Collectors.toList() );
    }

    public List<Integer> collectUsingLoopV1(List<Integer> ints){
        List<Integer> result = new ArrayList<>();
        for (int i : ints){
            if ( f1.test(i) && f2.test(i) ) result.add(i);
        }
        return result;
    }

    public List<Integer> collectUsingLoopV2(List<Integer> ints){
        List<Integer> result = new ArrayList<>();
        for (int i : ints){
            if ( f2.test(i) && f1.test(i) ) result.add(i);
        }
        return result;
    }

Again, once collect is called, will the stream in collectUsingStreams run f1 and f2 in the order shown in collectUsingLoopV1 (f1 is evaluated before f2) or as shown in collectUsingLoopV2 (f2 is evaluated before f1)?

And could that order change in future versions of Java, or is there official documentation that guarantees the order of f1 and f2 calls?

Edit

Thanks for all the answers and comments, but unfortunately I still don't see a good explanation of the order in which the elements are processed. The docs do say that the encounter order will be preserved for lists, but they don't specify how those elements will be processed. For example, in the case of findFirst the docs guarantee that map1 will see 1 first and then 2, but they do not say that map1 won't be executed for 4 and 5. Does that mean we can't guarantee that the order of processing will be what we expect in future versions of Java? Probably yes.

tsolakp asked Dec 22 '17 17:12


1 Answer

> And could that order change in future versions of Java, or is there official documentation that guarantees the order of map1 calls?

The javadocs, including package summaries (people often overlook those somehow), are the API contract. Behavior that is observable but not defined by the javadocs generally should be considered an implementation detail which may change in future versions.

So if it cannot be found in the javadocs then there is no guarantee.

The order in which stream pipeline stages are called and interleaved is not specified. What is specified is under which circumstances the so-called encounter order of a stream is preserved. Assuming an ordered stream, an implementation is still allowed to perform any interleaving, batching and internal reordering that preserves the encounter order. E.g. a sorted(comparator).filter(predicate).findFirst() could be internally replaced with a filter(predicate).min(comparator), which of course significantly changes how the Predicate and the Comparator are invoked and yet yields the same result, even in an ordered stream.
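To make the sorted/filter example above concrete, here is a minimal sketch that writes both pipelines out by hand and counts predicate invocations. The class and method names (EquivalentPipelines, counting, viaSorted, viaMin) are made up for illustration, and the counting wrapper is deliberately side-effecting only so the calls can be observed. Nothing here shows what the JDK actually does internally; it only shows that two differently shaped pipelines can agree on the result while calling the predicate differently.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

class EquivalentPipelines {
    // Hypothetical helper: wraps a predicate so its invocations can be counted.
    static <T> Predicate<T> counting(Predicate<T> p, AtomicInteger calls) {
        return t -> { calls.incrementAndGet(); return p.test(t); };
    }

    // Smallest matching element, expressed as sort, then filter, then findFirst.
    static Optional<Integer> viaSorted(List<Integer> ints, Predicate<Integer> p) {
        return ints.stream().sorted(Comparator.naturalOrder()).filter(p).findFirst();
    }

    // Same result, expressed as filter, then min.
    static Optional<Integer> viaMin(List<Integer> ints, Predicate<Integer> p) {
        return ints.stream().filter(p).min(Comparator.naturalOrder());
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(5, 3, 1, 4, 2);
        Predicate<Integer> greaterThanTwo = i -> i > 2;
        AtomicInteger sortedCalls = new AtomicInteger();
        AtomicInteger minCalls = new AtomicInteger();

        Optional<Integer> a = viaSorted(ints, counting(greaterThanTwo, sortedCalls));
        Optional<Integer> b = viaMin(ints, counting(greaterThanTwo, minCalls));

        // The results agree; the call counts are free to differ.
        System.out.println("same result: " + a.equals(b));
        System.out.println("predicate calls: " + sortedCalls + " vs " + minCalls);
    }
}
```

The min variant has to test every element, while the sorted variant may stop testing once the first match is found; code that depends on either count is relying on an implementation detail.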

> Does that mean we can't guarantee that the order of processing will be what we expect in future versions of Java? Probably yes.

Yes, and that should not be an issue since most of the stream APIs require callbacks to be stateless and free of side-effects, which among other things means they should not care about the internal execution order of the stream pipeline and the results should be identical, modulo the leeway granted by unordered streams.
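As a small illustration of that point, the sketch below runs the filter pipeline from the question's second example both sequentially and in parallel. The class name and the parallel flag are assumptions added for the demonstration. Because both predicates are stateless and the source list is ordered, the collected list should be the same either way, however the predicate calls end up being scheduled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StatelessCallbacks {
    // Same pipeline as collectUsingStreams in the question; 'parallel' is an
    // added switch so the two execution modes can be compared.
    static List<Integer> collect(List<Integer> ints, boolean parallel) {
        Stream<Integer> stream = parallel ? ints.parallelStream() : ints.stream();
        return stream
                .filter(i -> i > 2)   // stateless, no side effects
                .filter(i -> i > 3)   // stateless, no side effects
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> sequential = collect(ints, false);
        List<Integer> parallel = collect(ints, true);
        // Encounter order is preserved for an ordered source, so the lists match.
        System.out.println(sequential.equals(parallel));
    }
}
```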

The explicit requirements and the absence of guarantees give the JDK developers flexibility in how streams are implemented.

If you have a particular case in mind where this would matter, you should ask a more concrete question about the execution reordering that you'd like to avoid.


You should always keep in mind that streams could be parallel, e.g. instances passed by 3rd-party code, or could contain a source or intermediate stream operation that is less lazy than it could theoretically be (at the time of writing, flatMap is such an operation). Stream pipelines can also contain custom behavior if someone extracts and rewraps a spliterator or uses a custom implementation of the Stream interface.

So while particular stream implementations might exhibit some predictable behavior when you use them in a specific way, and future optimizations for that specific case could be considered highly unlikely, this does not generalize to all possible stream pipelines, and therefore the APIs can't provide such general guarantees.
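If you want to see what a given JDK does for the findFirst case from the question, a probe along these lines can record which elements reach the mapping function. MapCallProbe and mappedElements are hypothetical names, and the recording side effect exists purely for observation; it is exactly the kind of thing production callbacks should avoid. Whatever it prints is observed behavior of one implementation, not a contract.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Stream;

class MapCallProbe {
    // Returns the elements that were passed to the mapping function,
    // in the order the calls were recorded.
    static List<Integer> mappedElements(List<Integer> ints, boolean parallel) {
        // Thread-safe, because a parallel stream may call the lambda concurrently.
        Queue<Integer> seen = new ConcurrentLinkedQueue<>();
        Stream<Integer> stream = parallel ? ints.parallelStream() : ints.stream();
        stream.map(i -> { seen.add(i); return i; })  // side effect for observation only
              .filter(i -> i > 2)
              .findFirst();
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3, 4, 5);
        // Sequentially, OpenJDK has typically stopped after the first match,
        // but that is not something the javadocs promise.
        System.out.println("sequential: " + mappedElements(ints, false));
        // In parallel, both the set of mapped elements and their order may vary per run.
        System.out.println("parallel:   " + mappedElements(ints, true));
    }
}
```

The only thing that has to hold in both modes is that findFirst returns 3, which means 1, 2 and 3 must have been mapped at some point; whether 4 and 5 are mapped as well is up to the implementation.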

the8472 answered Sep 20 '22 00:09