I just came across a question when using a List and its stream() method. While I know how to use them, I'm not quite sure about when to use them.
For example, I have a list containing various paths to different locations. Now, I'd like to check whether a single, given path contains any of the paths specified in the list. I'd like to return a boolean based on whether or not the condition was met.
This, of course, is not a hard task per se. But I wonder whether I should use streams or a for(-each) loop.
The List
private static final List<String> EXCLUDE_PATHS = Arrays.asList( "my/path/one", "my/path/two" );
Example using Stream:
    private boolean isExcluded(String path) {
        return EXCLUDE_PATHS.stream()
                .map(String::toLowerCase)
                .filter(path::contains)
                .collect(Collectors.toList())
                .size() > 0;
    }
Example using for-each loop:
    private boolean isExcluded(String path) {
        for (String excludePath : EXCLUDE_PATHS) {
            if (path.contains(excludePath.toLowerCase())) {
                return true;
            }
        }
        return false;
    }
Note that the path parameter is always lowercase.
My first guess is that the for-each approach is faster, because the loop returns immediately once the condition is met, whereas the stream would still loop over all list entries in order to complete the filtering.
Is my assumption correct? If so, why (or rather when) would I use stream() then?
Conclusion: If you have a small list, for loops tend to perform better; if you have a huge list, a parallel stream may perform better. Since parallel streams carry quite a bit of overhead, it is not advised to use them unless you are sure the gain is worth that overhead.
Advantages of streams: Your stream-handling code doesn't need to know the source of the stream or its eventual terminal operation. Streams can succinctly express quite sophisticated behavior. Streams can replace loops because they, too, process a sequence of data.
Your assumption is correct. Your stream implementation is slower than the for-loop.
This stream usage should be as fast as the for-loop though:
EXCLUDE_PATHS.stream() .map(String::toLowerCase) .anyMatch(path::contains);
This iterates through the items, applying String::toLowerCase and the filter to the items one-by-one and terminating at the first item that matches.
Both collect() and anyMatch() are terminal operations. anyMatch() exits at the first found item, though, while collect() requires all items to be processed.
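To make the difference visible, here is a minimal sketch that counts how many elements each pipeline actually visits. The class name, the helper methods, the third list entry, and the use of peek() with a counter are illustrative assumptions, not part of the original code; the sketch also assumes a sequential stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

// Hypothetical demo class: counts the elements each terminal operation pulls through the pipeline.
class ShortCircuitDemo {
    // Third entry added only so the difference is easier to see.
    static final List<String> EXCLUDE_PATHS =
            Arrays.asList("my/path/one", "my/path/two", "my/path/three");

    // filter + collect: every element passes through peek(), match or not.
    static int visitedWithCollect(String path) {
        AtomicInteger visited = new AtomicInteger();
        EXCLUDE_PATHS.stream()
                .peek(p -> visited.incrementAndGet())
                .filter(path::contains)
                .collect(Collectors.toList());
        return visited.get();
    }

    // anyMatch: the pipeline stops pulling elements after the first match.
    static int visitedWithAnyMatch(String path) {
        AtomicInteger visited = new AtomicInteger();
        EXCLUDE_PATHS.stream()
                .peek(p -> visited.incrementAndGet())
                .anyMatch(path::contains);
        return visited.get();
    }

    public static void main(String[] args) {
        String path = "some/prefix/my/path/one";
        System.out.println("collect visited:  " + visitedWithCollect(path));
        System.out.println("anyMatch visited: " + visitedWithAnyMatch(path));
    }
}
```

For a path that matches the first entry, the collect() variant still visits all three entries, while the anyMatch() variant visits only one. When nothing matches, both visit every entry.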
The decision whether to use Streams or not should not be driven by performance considerations, but rather by readability. When it really comes to performance, there are other considerations.
With your .filter(path::contains).collect(Collectors.toList()).size() > 0 approach, you are processing all elements and collecting them into a temporary List before comparing the size. Still, this hardly ever matters for a Stream consisting of two elements.
Using .map(String::toLowerCase).anyMatch(path::contains) can save CPU cycles and memory if you have a substantially larger number of elements. Still, this converts each String to its lowercase representation until a match is found. Obviously, there is a point in using
    private static final List<String> EXCLUDE_PATHS =
        Stream.of("my/path/one", "my/path/two")
              .map(String::toLowerCase)
              .collect(Collectors.toList());

    private boolean isExcluded(String path) {
        return EXCLUDE_PATHS.stream().anyMatch(path::contains);
    }
instead, so you don't have to repeat the conversion to lowercase in every invocation of isExcluded. If the number of elements in EXCLUDE_PATHS or the lengths of the strings become really large, you may consider using
    private static final List<Predicate<String>> EXCLUDE_PATHS =
        Stream.of("my/path/one", "my/path/two")
              .map(String::toLowerCase)
              .map(s -> Pattern.compile(s, Pattern.LITERAL).asPredicate())
              .collect(Collectors.toList());

    private boolean isExcluded(String path) {
        return EXCLUDE_PATHS.stream().anyMatch(p -> p.test(path));
    }
Compiling a string as a regex pattern with the LITERAL flag makes it behave just like ordinary string operations, but allows the engine to spend some time in preparation, e.g. using the Boyer-Moore algorithm, to be more efficient when it comes to the actual comparison.
Of course, this only pays off if there are enough subsequent tests to compensate for the time spent in preparation. Determining whether this will be the case is one of the actual performance considerations, besides the first question of whether this operation will ever be performance critical at all. The question of whether to use Streams or for loops is not.
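As a small sanity check of the claim that a LITERAL pattern behaves like an ordinary substring test, the sketch below compares the predicate with String.contains for a needle that includes a regex metacharacter. The class and method names and the sample strings are made up for illustration.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical demo class for the LITERAL flag.
class LiteralPatternDemo {
    // Returns a predicate that is true when the input contains `needle` as a literal substring.
    // asPredicate() tests with find(), so the match may occur anywhere in the input.
    static Predicate<String> literalContains(String needle) {
        return Pattern.compile(needle, Pattern.LITERAL).asPredicate();
    }

    public static void main(String[] args) {
        Predicate<String> p = literalContains("my.path");
        // With LITERAL, the '.' matches only a real dot, not any character.
        System.out.println(p.test("x/my.path/y"));
        System.out.println(p.test("x/myXpath/y"));
    }
}
```

Without the LITERAL flag, the '.' in the needle would act as a wildcard and the second input would match as well, which is why the flag matters once the excluded paths contain characters such as '.' or '+'.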
By the way, the code examples above keep the logic of your original code, which looks questionable to me. Your isExcluded method returns true if the specified path contains any of the elements in the list, so it returns true for /some/prefix/to/my/path/one, as well as my/path/one/and/some/suffix or even /some/prefix/to/my/path/one/and/some/suffix.
Even dummy/path/onerous is considered to fulfil the criteria, as it contains the string my/path/one…
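The pitfall can be reproduced with a few lines. The sketch below contrasts the substring test from the question with one possible stricter check that only accepts an exact match or a match followed by a "/" separator. That stricter rule is only an assumption about what might be intended; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class contrasting substring matching with a stricter prefix rule.
class ContainsPitfallDemo {
    static final List<String> EXCLUDE_PATHS = Arrays.asList("my/path/one", "my/path/two");

    // Same logic as the question: true if any entry occurs anywhere inside the path.
    static boolean isExcludedBySubstring(String path) {
        return EXCLUDE_PATHS.stream().anyMatch(path::contains);
    }

    // Assumed alternative: the path must equal an entry or start with the entry followed by "/".
    static boolean isExcludedByPrefix(String path) {
        return EXCLUDE_PATHS.stream()
                .anyMatch(p -> path.equals(p) || path.startsWith(p + "/"));
    }

    public static void main(String[] args) {
        String odd = "dummy/path/onerous";
        System.out.println(isExcludedBySubstring(odd));
        System.out.println(isExcludedByPrefix(odd));
    }
}
```

With the substring test, dummy/path/onerous is reported as excluded; with the prefix rule it is not, while my/path/one/and/some/suffix still is. Whether prefixed paths such as /some/prefix/to/my/path/one should count depends on the actual requirement, which the question does not state.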