
Does the performance of "filter then map" and "map then filter" differ in a Stream?

I would like to know what is faster: to filter a custom object by field and then map by its field or vice-versa (map and then filter).
At the end, I usually want to collect the mapped field into some Collection.

For example, the simplest Person class:

public class Person {
    String uuid;
    String name;
    String secondName;

    // Getter used by the stream pipelines below
    public String getName() {
        return name;
    }
}

Now let's have a List<Person> persons.

List<String> filtered1 = persons
        .stream()
        .filter(p -> "NEED_TOY".equals(p.getName()))
        .map(Person::getName)
        .collect(Collectors.toList());
// or?
List<String> filtered2 = persons
        .stream()
        .map(Person::getName)
        .filter(name -> "NEED_TOY".equals(name))
        .collect(Collectors.toList());
asked Aug 17 '19 12:08 by keyzj


1 Answer

In this specific example, where calling Person.getName() costs essentially nothing, the order doesn't matter, and you should use whichever you find most readable. Filtering after mapping could even be marginally faster since, as TJ mentions, the mapping operation is already part of the filtering operation: the filter-first version calls getName() once in the predicate and again in the map for every matching element.

If the mapping operation has a significant cost, however, then filtering first (if possible) is more efficient, since the stream never has to map the elements that have been filtered out.

Let's take a contrived example: you have a stream of IDs, and for every even ID in the stream, you have to execute an HTTP GET request or a database query to get the details of the item identified by this ID (thus mapping the ID to a detailed object).

Assuming that half of the IDs in the stream are even and half are odd, that each request takes the same time, and that the requests run sequentially, filtering first halves the total time. If every HTTP request takes 1 second and you have 60 IDs, you would go from 60 seconds to 30 seconds for the same task by filtering first, and you would also reduce the load on the network and on the external HTTP API.
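To make the 60-versus-30 arithmetic concrete, here is a minimal sketch that counts how often the expensive step runs in each ordering. The class name FilterOrderDemo and the fetchDetails helper are made up for this illustration; fetchDetails only increments a counter and stands in for the real HTTP request or database query.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class FilterOrderDemo {

    // Hypothetical stand-in for an expensive lookup (HTTP GET, database query).
    // It performs no real I/O; it only records that it was called.
    static String fetchDetails(int id, AtomicInteger calls) {
        calls.incrementAndGet();
        return "details-" + id;
    }

    // Filter first: only even IDs ever reach the expensive mapping step.
    static int lookupsWhenFilteringFirst(List<Integer> ids) {
        AtomicInteger calls = new AtomicInteger();
        ids.stream()
           .filter(id -> id % 2 == 0)
           .map(id -> fetchDetails(id, calls))
           .collect(Collectors.toList());
        return calls.get();
    }

    // Map first: every ID is looked up, and the odd ones are discarded afterwards.
    static int lookupsWhenMappingFirst(List<Integer> ids) {
        AtomicInteger calls = new AtomicInteger();
        ids.stream()
           .map(id -> new SimpleEntry<Integer, String>(id, fetchDetails(id, calls)))
           .filter(entry -> entry.getKey() % 2 == 0)
           .map(entry -> entry.getValue())
           .collect(Collectors.toList());
        return calls.get();
    }

    public static void main(String[] args) {
        // IDs 1..60: half even, half odd, as in the example above.
        List<Integer> ids = IntStream.rangeClosed(1, 60)
                                     .boxed()
                                     .collect(Collectors.toList());
        System.out.println("filter then map: " + lookupsWhenFilteringFirst(ids) + " lookups");
        System.out.println("map then filter: " + lookupsWhenMappingFirst(ids) + " lookups");
    }
}
```

With 60 IDs this should report 30 lookups for the filter-first pipeline and 60 for the map-first one. Both pipelines collect the same 30 results, so only the amount of work differs.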

answered Oct 14 '22 00:10 by JB Nizet