
What is the difference between Collectors.toConcurrentMap and converting a Map to ConcurrentHashMap via Collectors.toMap supplier option?

I want to collect the elements of a stream into a ConcurrentHashMap using the Java 8 Stream and Collector APIs, and there are two options I can use.

The first:

Map<Integer, String> mb = persons.stream()
                                 .collect(Collectors.toMap(
                                            p -> p.age, 
                                            p -> p.name, 
                                            (name1, name2) -> name1+";"+name2,
                                            ConcurrentHashMap::new));

And the second:

Map<Integer, String> mb1 = persons.stream()
                                  .collect(Collectors.toConcurrentMap(
                                             p -> p.age, 
                                             p -> p.name));

Which one is the better option? When should I use each option?

KayV asked Nov 30 '16 12:11


2 Answers

There is a difference between them when dealing with parallel streams.

toMap is a non-concurrent collector.

toConcurrentMap is a concurrent collector. This can be seen from their characteristics: toConcurrentMap reports CONCURRENT and UNORDERED, while toMap reports neither.
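To check the characteristics yourself, something like the following sketch should do. The class and method names are made up for illustration; only Collectors.toMap, Collectors.toConcurrentMap and Collector.characteristics() come from the JDK.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Illustrative class name, not part of the question.
class CollectorCharacteristicsDemo {

    // Characteristics reported by the plain toMap collector.
    static Set<Collector.Characteristics> toMapCharacteristics() {
        Collector<String, ?, Map<Integer, String>> collector =
                Collectors.toMap(String::length, s -> s);
        return collector.characteristics();
    }

    // Characteristics reported by the toConcurrentMap collector.
    static Set<Collector.Characteristics> toConcurrentMapCharacteristics() {
        Collector<String, ?, ConcurrentMap<Integer, String>> collector =
                Collectors.toConcurrentMap(String::length, s -> s);
        return collector.characteristics();
    }

    public static void main(String[] args) {
        System.out.println("toMap:           " + toMapCharacteristics());
        System.out.println("toConcurrentMap: " + toConcurrentMapCharacteristics());
    }
}
```

The second set should contain CONCURRENT and UNORDERED, the first should not.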

The difference is that, on a parallel stream, toMap will create multiple intermediate results and then merge them together (the Supplier of such a Collector will be called multiple times), while toConcurrentMap will create a single result container and each thread will throw its elements at it (the Supplier of such a Collector will be called only once).

Why is this important? This deals with insertion order (if that matters).

toMap will insert values into the resulting Map in encounter order by merging multiple intermediate results (the Supplier of that collector is called multiple times, and so is the Combiner).

toConcurrentMap will collect elements in an undefined order by throwing all elements at a common result container (ConcurrentHashMap in this case). The Supplier is called only once, the Accumulator many times and the Combiner never.
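One way to observe this is to pass a map factory that counts how often it is called. This is only a sketch: the class name, the countingFactory helper and the 10,000-element source are my own choices, and the exact number of calls for toMap depends on how the stream happens to be split.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Illustrative class name, not part of the question.
class SupplierCallCountDemo {

    // Hypothetical helper: a map factory that records every invocation.
    static Supplier<ConcurrentMap<Integer, Integer>> countingFactory(AtomicInteger calls) {
        return () -> {
            calls.incrementAndGet();
            return new ConcurrentHashMap<>();
        };
    }

    // toMap on a parallel stream: one container per chunk, merged afterwards.
    static int supplierCallsWithToMap() {
        AtomicInteger calls = new AtomicInteger();
        IntStream.range(0, 10_000).boxed().parallel()
                .collect(Collectors.toMap(i -> i, i -> i, (a, b) -> a, countingFactory(calls)));
        return calls.get();
    }

    // toConcurrentMap on a parallel stream: one shared container for all threads.
    static int supplierCallsWithToConcurrentMap() {
        AtomicInteger calls = new AtomicInteger();
        IntStream.range(0, 10_000).boxed().parallel()
                .collect(Collectors.toConcurrentMap(i -> i, i -> i, (a, b) -> a, countingFactory(calls)));
        return calls.get();
    }

    public static void main(String[] args) {
        // Usually well above 1; the exact value varies with the splitting.
        System.out.println("toMap supplier calls:           " + supplierCallsWithToMap());
        // A single shared container.
        System.out.println("toConcurrentMap supplier calls: " + supplierCallsWithToConcurrentMap());
    }
}
```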

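The small caveat here is that a CONCURRENT collector only skips the combiner on a parallel stream when one of two things holds: either the stream is unordered (via an explicit unordered() call, or because the source of the stream is not ordered, a Set for example), or the collector itself has the UNORDERED characteristic. toConcurrentMap is both CONCURRENT and UNORDERED, so it always takes the single-container path on a parallel stream.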
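The caveat is easier to see with a hand-written collector that is CONCURRENT but deliberately not UNORDERED. The collector below is hypothetical, built with Collector.of purely for this demonstration, and all names are mine.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Illustrative class name, not part of the original answer.
class ConcurrentOnlyCollectorDemo {

    // Returns {supplierCalls, combinerCalls} for one parallel collect() run.
    static int[] run(boolean unordered) {
        AtomicInteger supplierCalls = new AtomicInteger();
        AtomicInteger combinerCalls = new AtomicInteger();

        // Hypothetical collector: CONCURRENT only, no UNORDERED characteristic.
        Collector<Integer, ConcurrentMap<Integer, Integer>, ConcurrentMap<Integer, Integer>> collector =
                Collector.of(
                        () -> {
                            supplierCalls.incrementAndGet();
                            return new ConcurrentHashMap<Integer, Integer>();
                        },
                        (map, i) -> map.put(i, i),
                        (left, right) -> {
                            combinerCalls.incrementAndGet();
                            left.putAll(right);
                            return left;
                        },
                        Collector.Characteristics.CONCURRENT);

        // A List is an ordered source, so its stream starts out ordered.
        List<Integer> source = IntStream.range(0, 10_000).boxed().collect(Collectors.toList());
        Stream<Integer> stream = source.parallelStream();
        if (unordered) {
            stream = stream.unordered();
        }
        stream.collect(collector);
        return new int[] {supplierCalls.get(), combinerCalls.get()};
    }

    public static void main(String[] args) {
        int[] ordered = run(false);
        int[] unorderedRun = run(true);
        // Ordered stream: falls back to per-chunk containers plus the combiner.
        System.out.println("ordered:   supplier=" + ordered[0] + ", combiner=" + ordered[1]);
        // Unordered stream: single shared container, combiner not used.
        System.out.println("unordered: supplier=" + unorderedRun[0] + ", combiner=" + unorderedRun[1]);
    }
}
```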

Eugene answered Nov 15 '22 19:11


From toMap's Javadoc:

The returned Collector is not concurrent. For parallel stream pipelines, the combiner function operates by merging the keys from one map into another, which can be an expensive operation. If it is not required that results are inserted into the Map in encounter order, using toConcurrentMap(Function, Function) may offer better parallel performance.

toConcurrentMap doesn't insert the results into the Map in encounter order, but it is supposed to give better parallel performance.

If you don't care about the insertion order, it is recommended to use toConcurrentMap if you are using a parallel stream.
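As a rough sketch of that recommendation applied to the question's data: the Person class below is an assumption modelled on the p.age and p.name fields used in the question, and the sample names are invented. Note that I use the three-argument overload with a merge function, because the two-argument toConcurrentMap throws an IllegalStateException when two elements map to the same key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

// Illustrative class name, not part of the question.
class PersonsByAgeDemo {

    // Hypothetical Person type, modelled on the fields used in the question.
    static final class Person {
        final int age;
        final String name;

        Person(int age, String name) {
            this.age = age;
            this.name = name;
        }
    }

    // Groups names by age on a parallel stream, joining names that share an age.
    static ConcurrentMap<Integer, String> namesByAge(List<Person> persons) {
        return persons.parallelStream()
                .collect(Collectors.toConcurrentMap(
                        p -> p.age,
                        p -> p.name,
                        (name1, name2) -> name1 + ";" + name2));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person(30, "Ann"), new Person(25, "Bob"), new Person(30, "Cid"));
        // The value for key 30 may be "Ann;Cid" or "Cid;Ann": the order is not guaranteed.
        System.out.println(namesByAge(persons));
    }
}
```

If the order of the joined names matters, toMap with an explicit map supplier (the first option in the question) is the safer choice.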

Eran answered Nov 15 '22 20:11