Choosing between Stream and Collections API

Consider the following example, which prints the maximum element in a List:

List<Integer> list = Arrays.asList(1,4,3,9,7,4,8);           
list.stream().max(Comparator.naturalOrder()).ifPresent(System.out::println);

The same objective can also be achieved using the Collections.max method:

System.out.println(Collections.max(list));

The above code is not only shorter but also cleaner to read (in my opinion). There are similar examples that come to mind such as the use of binarySearch vs filter used in conjunction with findAny.
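For illustration, the binarySearch versus filter/findAny comparison might look like the sketch below. The class and method names (SearchComparison, containsSorted, findWithStream) are hypothetical and only serve the example. Note that Collections.binarySearch assumes the list is already sorted in ascending order, while the stream version scans linearly and makes no such assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class SearchComparison {

    // Collections API: requires a list sorted in ascending order.
    // binarySearch returns a non-negative index when the key is found.
    static boolean containsSorted(List<Integer> sorted, int key) {
        return Collections.binarySearch(sorted, key) >= 0;
    }

    // Stream API: works on unsorted input, but inspects elements one by one.
    static Optional<Integer> findWithStream(List<Integer> list, int key) {
        return list.stream().filter(x -> x == key).findAny();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 4, 3, 9, 7, 4, 8);

        // Sort a copy so the original list from the question stays untouched.
        List<Integer> sorted = new ArrayList<>(list);
        Collections.sort(sorted);

        System.out.println(containsSorted(sorted, 9));
        System.out.println(findWithStream(list, 9).isPresent());
    }
}
```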

I understand that a Stream can be an infinite pipeline, whereas a Collection is limited by the memory available to the JVM. This would be my criterion for deciding whether to use a Stream or the Collections API. Are there any other reasons for choosing Stream over the Collections API (such as performance)? More generally, is this the only reason to choose Stream over an older API that can do the job in a cleaner and shorter way?
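As a small sketch of the "infinite pipeline" point: Stream.iterate describes an unbounded sequence whose elements are computed only on demand, so it has to be cut off (here with limit) before it is collected. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InfiniteStreamSketch {

    // Stream.iterate(1, x -> x * 2) is conceptually infinite: 1, 2, 4, 8, ...
    // limit(n) truncates it, so only the first n elements are ever computed.
    static List<Integer> firstPowersOfTwo(int n) {
        return Stream.iterate(1, x -> x * 2)
                     .limit(n)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5)); // prints [1, 2, 4, 8, 16]
    }
}
```

A Collection cannot represent such a sequence directly, because every element would have to exist in memory before being added.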

Chetan Kinger asked Jun 10 '15 18:06




1 Answer

The Stream API is like a Swiss Army knife: it lets you do quite complex operations by combining its tools effectively. On the other hand, if you just need a screwdriver, a standalone screwdriver is probably more convenient. The Stream API includes many things (like distinct, sorted, primitive operations etc.) that would otherwise require you to write several lines, introduce intermediate variables or data structures, and write boring loops that draw the programmer's attention away from the actual algorithm.

Sometimes using the Stream API can improve performance even for sequential code. For example, consider some old API:

class Group {
    private Map<String, User> users;

    public List<User> getUsers() {
        return new ArrayList<>(users.values());
    }
}

Here we want to return all the users of the group. The API designer decided to return a List, but callers can use it in various ways:

List<User> users = group.getUsers();
Collections.sort(users);
someOtherMethod(users.toArray(new User[users.size()]));

Here the list is sorted and converted to an array to pass to some other method that happens to accept an array. Elsewhere, getUsers() may be used like this:

List<User> users = group.getUsers();
for(User user : users) {
    if(user.getAge() < 18) {
        throw new IllegalStateException("Underage user in selected group!");
    }
}

Here we just want to check whether any user matches some criterion. In both cases, copying to an intermediate ArrayList was actually unnecessary. When we move to Java 8, we can replace the getUsers() method with users():

public Stream<User> users() {
    return users.values().stream();
}

Then we modify the calling code. The first caller becomes:

someOtherMethod(group.users().sorted().toArray(User[]::new));

The second one:

if(group.users().anyMatch(user -> user.getAge() < 18)) {
    throw new IllegalStateException("Underage user in selected group!");
}

This way the code is not only shorter but may also run faster, because we skip the intermediate copy.
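The refactoring above can be put together as a small self-contained sketch. The User class, the add method, and the helper names sortedByName and hasUnderageUser are assumptions made for illustration; the original does not show them. The original calls sorted() with no arguments, which requires User to implement Comparable. This sketch passes an explicit Comparator instead so that it does not depend on that.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Stream;

public class GroupSketch {

    // Hypothetical User type, reduced to the fields the example needs.
    static final class User {
        private final String name;
        private final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    static final class Group {
        private final Map<String, User> users = new LinkedHashMap<>();

        // Hypothetical helper so the sketch can populate the group.
        void add(User user) {
            users.put(user.getName(), user);
        }

        // Exposes the users as a stream; no intermediate ArrayList is created.
        Stream<User> users() {
            return users.values().stream();
        }
    }

    // First caller: sort and convert to an array in one pipeline.
    static User[] sortedByName(Group group) {
        return group.users()
                    .sorted(Comparator.comparing(User::getName))
                    .toArray(User[]::new);
    }

    // Second caller: anyMatch stops at the first matching user.
    static boolean hasUnderageUser(Group group) {
        return group.users().anyMatch(user -> user.getAge() < 18);
    }

    public static void main(String[] args) {
        Group group = new Group();
        group.add(new User("carol", 34));
        group.add(new User("alice", 17));
        group.add(new User("bob", 25));

        for (User user : sortedByName(group)) {
            System.out.println(user.getName());
        }
        System.out.println(hasUnderageUser(group));
    }
}
```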

Another conceptual point of the Stream API is that any stream code written according to the guidelines can be parallelized simply by adding a parallel() step. Of course, this will not always boost performance, but it helps more often than I expected. Usually, if an operation takes 0.1 ms or longer when executed sequentially, it can benefit from parallelization. In any case, we have not seen such a simple way to do parallel programming in Java before.
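A minimal sketch of what "adding a parallel() step" means in practice is shown below. The class and method names are hypothetical, and the workload is deliberately trivial: it only shows that the sequential and parallel pipelines give the same result, not that the parallel one is faster. Any speed-up depends on the workload and hardware and should be measured.

```java
import java.util.stream.LongStream;

public class ParallelSketch {

    // Sums the squares of 1..n. The pipeline is stateless and free of side
    // effects, so switching to parallel execution does not change the result.
    static long sumOfSquares(long n, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel(); // the only change needed
        }
        return range.map(x -> x * x).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000, false));
        System.out.println(sumOfSquares(1000, true));
    }
}
```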

Tagir Valeev answered Oct 03 '22 05:10