 

Java 8 stream short-circuit

While reading up on Java 8, I came across this blog post about streams and their reduction, and about when the reduction can be short-circuited. At the bottom it states:

Note in the case of findFirst or findAny we only need the first value which matches the predicate (although findAny is not guaranteed to return the first). However if the stream has no ordering then we’d expect findFirst to behave like findAny. The operations allMatch, noneMatch and anyMatch may not short-circuit the stream at all since it may take evaluating all the values to determine whether the operator is true or false. Thus an infinite stream using these may not terminate.

I get that findFirst or findAny may short-circuit the reduction, because as soon as you find an element, you don't need to process any further.

But why would this not be possible for allMatch, noneMatch and anyMatch? For allMatch, if you find one element which doesn't match the predicate, you can stop processing. The same goes for noneMatch as soon as one element does match. And anyMatch especially doesn't make sense to me, as it is pretty much equal to findAny (except for what is returned)?

The argument that these three may not short-circuit, because it may take evaluating all the values, applies equally to findFirst and findAny.

Is there some fundamental difference I'm missing? Am I not really understanding what is going on?

Koekje asked Aug 24 '15 22:08




2 Answers

There's a subtle difference, because the anyMatch family uses a predicate, while the findAny family does not. Technically findAny() looks like anyMatch(x -> true) and anyMatch(pred) looks like filter(pred).findAny(). So here we have another issue. Consider a simple infinite stream:

Stream<Integer> s = Stream.generate(() -> 1); 

So it's true that applying findAny() to such a stream will always short-circuit and finish, while whether anyMatch(pred) finishes depends on the predicate. However, let's filter our infinite stream:

Stream<Integer> s = Stream.generate(() -> 1).filter(x -> x < 0); 

Is the resulting stream infinite as well? That's a tricky question. It actually contains no elements, but to determine this (for example, using .iterator().hasNext()) we have to check an infinite number of underlying stream elements, so this operation will never finish. I would call such a stream infinite as well. With such a stream, however, neither anyMatch nor findAny will ever finish:

Stream.generate(() -> 1).filter(x -> x < 0).anyMatch(x -> true);
Stream.generate(() -> 1).filter(x -> x < 0).findAny();

So findAny() is not guaranteed to finish either; it depends on the preceding intermediate stream operations.

To conclude, I would rate that blog post as very misleading. In my opinion, infinite stream behavior is better explained in the official JavaDoc.
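The cases above can be tried out with a small sketch, assuming Java 8 or later. The class name InfiniteStreamDemo, its method names and the counter are illustrative and not part of the original answer. The filtered case is capped with limit(cap) so that the example terminates; it only shows that every element up to the cap is examined, and without the cap the call would never return.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class InfiniteStreamDemo {

    // findAny() on an unfiltered infinite stream returns after pulling one element.
    static Optional<Integer> findAnyOnInfinite() {
        return Stream.generate(() -> 1).findAny();
    }

    // anyMatch with a predicate that the first element satisfies also returns at once.
    static boolean anyMatchThatFires() {
        return Stream.generate(() -> 1).anyMatch(x -> x > 0);
    }

    // Counts how many source elements findAny() examines when the filter
    // rejects everything. The cap is an assumption added so the demo ends.
    static int elementsExaminedByFindAny(int cap) {
        AtomicInteger examined = new AtomicInteger();
        Optional<Integer> result = Stream.generate(() -> 1)
                .limit(cap)
                .peek(x -> examined.incrementAndGet())
                .filter(x -> x < 0)
                .findAny();
        // Nothing passes the filter, so the result is expected to be empty.
        return result.isPresent() ? -1 : examined.get();
    }

    public static void main(String[] args) {
        System.out.println(findAnyOnInfinite());
        System.out.println(anyMatchThatFires());
        System.out.println(elementsExaminedByFindAny(1_000));
    }
}
```

With a sequential stream, the last call examines all 1,000 capped elements before giving up, which is why the uncapped version cannot finish.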

Tagir Valeev answered Sep 24 '22 11:09


Answer Updated

I'd say the blog post is wrong when it says "findFirst or findAny we only need the first value which matches the predicate".

The javadoc for each of allMatch(Predicate), anyMatch(Predicate), noneMatch(Predicate), findAny(), and findFirst() says:

This is a short-circuiting terminal operation.

However, note that findFirst and findAny don't take a Predicate, so they can both return immediately upon seeing the first/any value. The other three are conditional and may loop forever on an infinite stream if the condition never fires.
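To see that the three conditional operations do stop as soon as the outcome is decided, here is a minimal sketch, assuming Java 8 or later and a sequential stream. The class MatchShortCircuitDemo and the counting helper are hypothetical names introduced for illustration; each method returns the number of predicate evaluations performed on the infinite stream 1, 2, 3, ...

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.stream.Stream;

class MatchShortCircuitDemo {

    // Wraps a predicate so that every evaluation increments the counter.
    static Predicate<Integer> counting(Predicate<Integer> p, AtomicInteger counter) {
        return x -> {
            counter.incrementAndGet();
            return p.test(x);
        };
    }

    // allMatch stops at the first element that fails the predicate (here: 5).
    static int evaluationsForAllMatch() {
        AtomicInteger n = new AtomicInteger();
        Stream.iterate(1, x -> x + 1).allMatch(counting(x -> x < 5, n));
        return n.get();
    }

    // noneMatch stops at the first element that satisfies the predicate (here: 3).
    static int evaluationsForNoneMatch() {
        AtomicInteger n = new AtomicInteger();
        Stream.iterate(1, x -> x + 1).noneMatch(counting(x -> x == 3, n));
        return n.get();
    }

    // anyMatch stops at the first element that satisfies the predicate (here: 7).
    static int evaluationsForAnyMatch() {
        AtomicInteger n = new AtomicInteger();
        Stream.iterate(1, x -> x + 1).anyMatch(counting(x -> x % 7 == 0, n));
        return n.get();
    }

    public static void main(String[] args) {
        System.out.println(evaluationsForAllMatch());
        System.out.println(evaluationsForNoneMatch());
        System.out.println(evaluationsForAnyMatch());
    }
}
```

If the predicates were changed so that the condition never fires, for example allMatch(x -> x > 0) on this stream, the calls would not return.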

Andreas answered Sep 22 '22 11:09