Given I have a stream Stream<T> stream = list.stream().filter(some predicate)
where the list is very large, is it more efficient to check whether the stream is non-empty by doing: stream.count() > 0
or by doing: stream.findFirst().isPresent()
?
If all you want to know, is whether there is a match, you should uselist.stream().anyMatch(some predicate)
, not only because it is more efficient, but also, because it is the correct idiom for expressing you intention.
As said by others, anyMatch
is short-circuiting, which implies that it will stop at the first match, whereas count
will, as the name suggests, count all matches before returning. Depending on the stream contents, this can make a huge performance difference. But note, that you can make count
similarly efficient, by using
list.stream().filter(some predicate).limit(1).count() > 0
Then, it will also stop after the first occurrence, but, as said, anyMatch
is still the preferred way to express that you are interested in whether there is any match. Things change, when the task is to find out whether there are at least n
matches. Then, .limit(n).count() > n-1
(or >= n
) becomes the natural idiom.
Note that findFirst()
differs from the other solution because its answer depends on the ordering. So if you only want to know whether there is a match, you should use findAny()
instead. Still, there is a theoretical difference due to the requirement of returning the matching value, compared to just telling whether there is a match, like anyMatch
does, though currently that difference lies only in the construction of an Optional
instance, hence is negligible.
But since you are programming against an API to encode your intention, you should not use find…
when you only want to know whether there is a match. anyMatch
clearly expresses your intention and might have an even higher benefit in future implementations or more complex scenarios.
I would recommend using list.stream().anyMatch(some predicate)
, which is a terminal operation for exactly this case. It's not only more efficient than stream.count()
, but it won't hang on infinite streams.
findAny
(which is preferable to findFirst
if you do not need ordering) and anyMatch
are short-circuiting operations, which means they can return early without consuming the entire stream if the conditions allow. This is mentioned and linked in their method javadocs. count()
is not.
If the stream's final stage still uses a spliterator with the SIZED characteristic then count()
may be as fast as the other two options. But this is a far weaker property than short-circuiting since intermediate stream operations – such as filter()
– are very likely to discard the SIZED aspect.
All this information can be gleaned from the package documentation, it's highly recommended reading.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With