I was wondering whether there is a preferred way to get from a stream of lists to a collection containing the elements of all the lists in the stream. I can think of two ways to get there:
final Stream<List<Integer>> stream = Stream.empty();
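// Option one: mutable reduction with the three-argument collect.
final List<Integer> one = stream.collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);
// Option two: flatMap, then collect. This is an alternative to option one; a stream can be consumed only once.
final List<Integer> two = stream.flatMap(List::stream).collect(Collectors.toList());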
The second option looks much nicer to me, but I guess the first one is more efficient in parallel streams. Are there further arguments for or against one of the two methods?
Create an empty list to collect the flattened elements. Using a forEach loop, convert each sub-list into a stream and add its elements to the result list. Then convert the result list into a stream with the stream() method, and collect that stream back into a list with the collect() method.
Convert a List into a Stream using the List.stream() method.
Flattening is the process of merging several lists (for example, a list of lists) into a single list containing all the elements from all of them.
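The loop-based flattening described above can be sketched as follows. This is only a minimal illustration: the class name LoopFlatten and the helper flatten are hypothetical, and it uses addAll directly rather than streaming each sub-list. It assumes Java 9 or later for List.of.

```java
import java.util.ArrayList;
import java.util.List;

class LoopFlatten {
    // Hypothetical helper: copies every element of every sub-list into one new list.
    static <T> List<T> flatten(List<List<T>> lists) {
        List<T> result = new ArrayList<>();
        for (List<T> sub : lists) {
            result.addAll(sub); // preserves the order of the sub-lists and of their elements
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> nested = List.of(List.of(1, 2), List.of(3), List.of());
        System.out.println(flatten(nested)); // prints [1, 2, 3]
    }
}
```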
The main difference is that flatMap is an intermediate operation, while collect is a terminal operation. So flatMap is the only way to process the flattened stream items if you want to do operations other than collecting immediately.
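To illustrate the point about intermediate operations, here is a small sketch in which the flattened elements are filtered and mapped before they are collected. The class name FlatMapPipeline, the method evenDoubled, and the particular filter and map steps are made up for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapPipeline {
    // Hypothetical example: after flattening, keep only even values and double them.
    static List<Integer> evenDoubled(Stream<List<Integer>> lists) {
        return lists
                .flatMap(List::stream)       // intermediate: Stream<List<Integer>> -> Stream<Integer>
                .filter(n -> n % 2 == 0)     // further intermediate operations are still possible
                .map(n -> n * 2)
                .collect(Collectors.toList()); // terminal operation ends the pipeline
    }
}
```

With the three-argument collect, the same filtering would have to happen on the resulting list afterwards, because collect ends the pipeline.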
Furthermore, collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll) is very hard to read, given that it has two identical method references, ArrayList::addAll, with completely different semantics.
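Writing the three arguments as lambdas makes the two roles of addAll visible. This is just a sketch of the same call; the class name ExplicitCollect and the parameter names are chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ExplicitCollect {
    static List<Integer> flatten(Stream<List<Integer>> lists) {
        return lists.collect(
                () -> new ArrayList<Integer>(),               // supplier: creates an empty result container
                (result, subList) -> result.addAll(subList),  // accumulator: adds one stream element (a sub-list)
                (left, right) -> left.addAll(right));         // combiner: merges two partial result containers
    }
}
```

The accumulator receives a stream element, while the combiner receives two partial results; the combiner only comes into play when the stream is processed in parallel.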
Regarding parallel processing, your guess is wrong. The first one offers less opportunity for parallel processing, as it relies on ArrayList.addAll applied to the stream items (sub-lists), which can’t be broken into parallel sub-steps. In contrast, Collectors.toList() applied to a flatMap can do parallel processing of sub-list items if the particular Lists encountered in the stream support it. But this will be relevant only if you have a rather small stream of rather big sub-lists.
The only drawback of flatMap is the intermediate stream creation, which adds overhead when you have a lot of very small sub-lists.
But in your example, the stream is empty so it doesn’t matter (scnr).
I think the intent of option two is much clearer than that of option one. It took me a few seconds to work out what was happening with the first one; it doesn't look "right", although it seems valid. Option two was more obvious to me.
Essentially, the intent of what you are doing is a flatMap. If that's the case, I'd expect to see flatMap used rather than addAll().