
Is there a preferred way to collect a stream of lists into a flat list?

I was wondering whether there is a preferred way to get from a stream of lists to a single collection containing the elements of all the lists in the stream. I can think of two ways to get there:

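final Stream<List<Integer>> stream = Stream.empty();
// Option one: three-argument collect (supplier, accumulator, combiner)
final List<Integer> one = stream.collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);
// Option two: flatMap, then collect (an alternative to option one; a stream can be consumed only once)
final List<Integer> two = stream.flatMap(List::stream).collect(Collectors.toList());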

The second option looks much nicer to me, but I guess the first one is more efficient in parallel streams. Are there further arguments for or against one of the two methods?

asked Sep 02 '14 14:09 by muued




2 Answers

The main difference is that flatMap is an intermediate operation, while collect is a terminal operation.

So flatMap is the only option if you want to apply further operations to the flattened elements rather than collecting them immediately.
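As a minimal sketch of that point, the following chains more intermediate operations after flatMap before collecting. The class name FlatMapPipeline, the method evenSquares, and the filter/map steps are illustrative assumptions, not part of the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class; the names are chosen for illustration only.
public class FlatMapPipeline {

    // Flattens the sub-lists, keeps the even numbers, and squares them.
    static List<Integer> evenSquares(Stream<List<Integer>> lists) {
        return lists
                .flatMap(List::stream)      // intermediate: Stream<List<Integer>> -> Stream<Integer>
                .filter(n -> n % 2 == 0)    // further per-element operations can follow
                .map(n -> n * n)
                .collect(Collectors.toList()); // terminal operation ends the pipeline
    }

    public static void main(String[] args) {
        Stream<List<Integer>> lists = Stream.of(Arrays.asList(1, 2), Arrays.asList(3, 4));
        System.out.println(evenSquares(lists)); // prints [4, 16]
    }
}
```

With the three-argument collect, the same filtering and mapping would have to happen on the resulting list afterwards, since collect ends the stream.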

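Furthermore, collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll) is very hard to read, because the two identical method references ArrayList::addAll have completely different semantics: the first adds a sub-list from the stream to a result container, while the second merges two partial result containers.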
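To make the two roles visible, here is a sketch that replaces the method references with named lambda parameters. The class name ExplicitCollect and the parameter names are assumptions made for illustration; the behavior should match the method-reference version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical example class; the names are chosen for illustration only.
public class ExplicitCollect {

    static List<Integer> flatten(Stream<List<Integer>> lists) {
        return lists.collect(
                ArrayList::new,                               // supplier: creates an empty result container
                (result, subList) -> result.addAll(subList),  // accumulator: adds one stream item (a sub-list)
                (left, right) -> left.addAll(right));         // combiner: merges two partial results
    }

    public static void main(String[] args) {
        Stream<List<Integer>> lists = Stream.of(Arrays.asList(1, 2), Arrays.asList(3));
        System.out.println(flatten(lists)); // prints [1, 2, 3]
    }
}
```

The combiner is only invoked when partial results have to be merged, which typically happens in parallel execution.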

Regarding parallel processing, your guess is wrong. The first variant has less potential for parallel processing, because it relies on ArrayList.addAll applied to the stream items (the sub-lists), which can’t be broken into parallel sub-steps. In contrast, Collectors.toList() applied to a flatMap can process the sub-list items in parallel if the particular Lists encountered in the stream support it. But this is relevant only if you have a rather small stream of rather big sub-lists.
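The sketch below runs both variants on a parallel stream and only checks that they produce the same list in encounter order. It makes no claim about relative speed, which would need a proper benchmark. The class name ParallelFlatten and the sample data are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; the names are chosen for illustration only.
public class ParallelFlatten {

    // Variant one: each sub-list is added to a partial result with a single addAll call.
    static List<Integer> viaCollect(List<List<Integer>> lists) {
        return lists.parallelStream()
                .collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);
    }

    // Variant two: sub-lists are turned into streams and flattened before collecting.
    static List<Integer> viaFlatMap(List<List<Integer>> lists) {
        return lists.parallelStream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> input = Arrays.asList(
                Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4, 5, 6));
        System.out.println(viaCollect(input));
        System.out.println(viaFlatMap(input));
        System.out.println(viaCollect(input).equals(viaFlatMap(input)));
    }
}
```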

The only drawback of flatMap is the intermediate stream creation, which adds overhead if you have a lot of very small sub-lists.

But in your example, the stream is empty so it doesn’t matter (scnr).

answered Nov 15 '22 19:11 by Holger


I think the intent of option two is much clearer than that of option one. It took me a few seconds to work out what was happening with the first one; it doesn't look "right", although it seems valid. Option two was more obvious to me.

Essentially, the intent of what you are doing is a flatMap. If that's the case, I'd expect to see flatMap used rather than addAll().

answered Nov 15 '22 18:11 by Ian Fairman