Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ConcurrentHashMap: avoid extra object creation with "putIfAbsent"?

I am aggregating multiple values for keys in a multi-threaded environment. The keys are not known in advance. I thought I would do something like this:

class Aggregator {     protected ConcurrentHashMap<String, List<String>> entries =                             new ConcurrentHashMap<String, List<String>>();     public Aggregator() {}      public void record(String key, String value) {         List<String> newList =                     Collections.synchronizedList(new ArrayList<String>());         List<String> existingList = entries.putIfAbsent(key, newList);         List<String> values = existingList == null ? newList : existingList;         values.add(value);     } } 

The problem I see is that every time this method runs, I need to create a new instance of an ArrayList, which I then throw away (in most cases). This seems like unjustified abuse of the garbage collector. Is there a better, thread-safe way of initializing this kind of a structure without having to synchronize the record method? I am somewhat surprised by the decision to have the putIfAbsent method not return the newly-created element, and by the lack of a way to defer instantiation unless it is called for (so to speak).

like image 471
Gene Golovchinsky Avatar asked May 24 '12 18:05

Gene Golovchinsky


2 Answers

Java 8 introduced an API to cater for this exact problem, making a 1-line solution:

public void record(String key, String value) {     entries.computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<String>())).add(value); } 

For Java 7:

public void record(String key, String value) {     List<String> values = entries.get(key);     if (values == null) {         entries.putIfAbsent(key, Collections.synchronizedList(new ArrayList<String>()));         // At this point, there will definitely be a list for the key.         // We don't know or care which thread's new object is in there, so:         values = entries.get(key);     }     values.add(value); } 

This is the standard code pattern when populating a ConcurrentHashMap.

The special method putIfAbsent(K, V)) will either put your value object in, or if another thread got before you, then it will ignore your value object. Either way, after the call to putIfAbsent(K, V)), get(key) is guaranteed to be consistent between threads and therefore the above code is threadsafe.

The only wasted overhead is if some other thread adds a new entry at the same time for the same key: You may end up throwing away the newly created value, but that only happens if there is not already an entry and there's a race that your thread loses, which would typically be rare.

like image 79
Bohemian Avatar answered Sep 20 '22 06:09

Bohemian


As of Java-8 you can create Multi Maps using the following pattern:

public void record(String key, String value) { entries.computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<String>())) .add(value); }

The ConcurrentHashMap documentation (not the general contract) specifies that the ArrayList will only be created once for each key, at the slight initial cost of delaying updates while the ArrayList is being created for a new key:

http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html#computeIfAbsent-K-java.util.function.Function-

like image 33
Peter Avatar answered Sep 19 '22 06:09

Peter