
When should I use streams?

I just came across a question when using a List and its stream() method. While I know how to use them, I'm not quite sure about when to use them.

For example, I have a list, containing various paths to different locations. Now, I'd like to check whether a single, given path contains any of the paths specified in the list. I'd like to return a boolean based on whether or not the condition was met.

This is, of course, not a hard task per se, but I wonder whether I should use streams or a for(-each) loop.

The List

    private static final List<String> EXCLUDE_PATHS = Arrays.asList(
        "my/path/one",
        "my/path/two"
    );

Example using Stream:

    private boolean isExcluded(String path) {
        return EXCLUDE_PATHS.stream()
                            .map(String::toLowerCase)
                            .filter(path::contains)
                            .collect(Collectors.toList())
                            .size() > 0;
    }

Example using for-each loop:

    private boolean isExcluded(String path) {
        for (String excludePath : EXCLUDE_PATHS) {
            if (path.contains(excludePath.toLowerCase())) {
                return true;
            }
        }
        return false;
    }

Note that the path parameter is always lowercase.

My first guess is that the for-each approach is faster, because the loop returns immediately once the condition is met, whereas the stream would still process all list entries in order to complete the filtering.

Is my assumption correct? If so, why (or rather when) would I use stream() then?

mcuenez asked Feb 27 '17 12:02



2 Answers

Your assumption is correct. Your stream implementation is slower than the for-loop.

This stream usage should be as fast as the for-loop though:

    EXCLUDE_PATHS.stream()
                 .map(String::toLowerCase)
                 .anyMatch(path::contains);

This iterates through the items, applying String::toLowerCase and then the path::contains check to each item in turn, and stops at the first item that matches.

Both collect() and anyMatch() are terminal operations. However, anyMatch() is short-circuiting and returns as soon as it finds a matching item, while collect() requires all items to be processed.
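To make the difference visible, here is a minimal sketch that counts how many elements actually reach the containment check in each variant. The class name ShortCircuitDemo, the helper method names, and the AtomicInteger counter are assumptions added purely for illustration; they are not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class ShortCircuitDemo {
    static final List<String> EXCLUDE_PATHS = Arrays.asList("my/path/one", "my/path/two");

    // Returns how many elements were tested when filter() + collect() is used.
    static int visitedWithCollect(String path) {
        AtomicInteger visited = new AtomicInteger();
        EXCLUDE_PATHS.stream()
                .map(String::toLowerCase)
                .filter(p -> {
                    visited.incrementAndGet(); // count every element that is tested
                    return path.contains(p);
                })
                .collect(Collectors.toList());
        return visited.get();
    }

    // Returns how many elements were tested when anyMatch() is used.
    static int visitedWithAnyMatch(String path) {
        AtomicInteger visited = new AtomicInteger();
        EXCLUDE_PATHS.stream()
                .map(String::toLowerCase)
                .anyMatch(p -> {
                    visited.incrementAndGet(); // count every element that is tested
                    return path.contains(p);
                });
        return visited.get();
    }

    public static void main(String[] args) {
        String path = "some/prefix/my/path/one";
        System.out.println("collect tested:  " + visitedWithCollect(path));  // prints 2
        System.out.println("anyMatch tested: " + visitedWithAnyMatch(path)); // prints 1
    }
}
```

With a path that matches the first entry, the collect() variant still tests both entries, while anyMatch() stops after the first one.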

Stefan Pries answered Oct 09 '22 17:10


The decision whether to use Streams or not should not be driven by performance considerations, but rather by readability. When it really comes to performance, there are other considerations.

With your .filter(path::contains).collect(Collectors.toList()).size() > 0 approach, you are processing all elements and collecting them into a temporary List before comparing the size. Still, this hardly ever matters for a Stream consisting of two elements.

Using .map(String::toLowerCase).anyMatch(path::contains) can save CPU cycles and memory if you have a substantially larger number of elements. Still, on every call this converts each String to its lowercase representation until a match is found. Obviously, there is a point in using

    private static final List<String> EXCLUDE_PATHS =
        Stream.of("my/path/one", "my/path/two").map(String::toLowerCase)
              .collect(Collectors.toList());

    private boolean isExcluded(String path) {
        return EXCLUDE_PATHS.stream().anyMatch(path::contains);
    }

instead, so you don't have to repeat the conversion to lowercase in every invocation of isExcluded. If the number of elements in EXCLUDE_PATHS or the lengths of the strings become really large, you may consider using

    private static final List<Predicate<String>> EXCLUDE_PATHS =
        Stream.of("my/path/one", "my/path/two").map(String::toLowerCase)
              .map(s -> Pattern.compile(s, Pattern.LITERAL).asPredicate())
              .collect(Collectors.toList());

    private boolean isExcluded(String path) {
        return EXCLUDE_PATHS.stream().anyMatch(p -> p.test(path));
    }

Compiling a string as a regex pattern with the LITERAL flag makes it behave just like ordinary string operations, but allows the engine to spend some time in preparation, e.g. using the Boyer-Moore algorithm, to be more efficient when it comes to the actual comparison.

Of course, this only pays off if there are enough subsequent tests to compensate for the time spent in preparation. Determining whether this will be the case is one of the actual performance considerations, besides the first question of whether this operation will ever be performance critical at all. The question of whether to use Streams or for loops is not one of them.
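As a small illustration of the LITERAL flag, the following sketch shows that a predicate built this way performs a plain substring test, with regex metacharacters taken literally. The class name LiteralPatternDemo, the helper containsLiteral, and the sample strings are hypothetical and chosen only for this example; it says nothing about whether the approach is actually faster in a given setup.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

class LiteralPatternDemo {
    // Builds a reusable "contains" test; the pattern is compiled only once.
    static Predicate<String> containsLiteral(String needle) {
        return Pattern.compile(needle, Pattern.LITERAL).asPredicate();
    }

    public static void main(String[] args) {
        // The dot is deliberately included to show that it is not a wildcard here.
        Predicate<String> literal = containsLiteral("my/path.one");
        System.out.println(literal.test("x/my/path.one/y")); // true, same as String.contains
        System.out.println(literal.test("x/my/pathXone/y")); // false, '.' is matched literally

        // Without the LITERAL flag, '.' matches any character.
        Predicate<String> regex = Pattern.compile("my/path.one").asPredicate();
        System.out.println(regex.test("x/my/pathXone/y"));   // true
    }
}
```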

By the way, the code examples above keep the logic of your original code, which looks questionable to me. Your isExcluded method returns true if the specified path contains any of the elements in the list, so it returns true for /some/prefix/to/my/path/one, as well as my/path/one/and/some/suffix or even /some/prefix/to/my/path/one/and/some/suffix.

Even dummy/path/onerous is considered to fulfill the criteria, as it contains the string my/path/one.
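The difference can be shown with a short sketch. It assumes, purely for illustration, that the intent is to exclude a path only when it equals an entry or lies beneath it; the question does not state this. The class name PathPrefixDemo and the method isExcludedByPrefix are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class PathPrefixDemo {
    static final List<String> EXCLUDE_PATHS = Arrays.asList("my/path/one", "my/path/two");

    // Original logic: a plain substring match anywhere in the path.
    static boolean isExcludedBySubstring(String path) {
        return EXCLUDE_PATHS.stream().anyMatch(path::contains);
    }

    // Assumed stricter variant: the path must equal an entry or start with
    // the entry followed by a separator, so matches end on a segment boundary.
    static boolean isExcludedByPrefix(String path) {
        return EXCLUDE_PATHS.stream()
                .anyMatch(ex -> path.equals(ex) || path.startsWith(ex + "/"));
    }

    public static void main(String[] args) {
        System.out.println(isExcludedBySubstring("dummy/path/onerous"));       // true
        System.out.println(isExcludedByPrefix("dummy/path/onerous"));          // false
        System.out.println(isExcludedByPrefix("my/path/one/and/some/suffix")); // true
    }
}
```

Whether a prefix check like this is the right rule depends on what the excluded paths are meant to cover.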

Holger answered Oct 09 '22 16:10