Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ConcurrentHashMap putIfAbsent : atomicity when followed by a get() call

I wanted to discuss a specific use I have of a concurrent map to sense check my logic...

If I used ConcurrentHashMap, I can do the familar

private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<K, V>();

public V getExampleOne(K key) {
    map.putIfAbsent(key, new Object());
    return map.get(key);
}

but I realise that a race condition exists whereby if I remove the item from the map between the putIfAbsent and the get, the method above would return something that no longer exists in the collection. This may or may not be fine, but lets assume that for my use case, it's not ok.

What I'd really like is to have the whole thing atomic. So,

public V getExampleTwo(K key) {
    return map.putIfAbsent(key, new Object());
}

but as this expands out to

if (!map.containsKey(key))
   return map.put(key, value);     [1]
return map.get(key);

which for line [1] will return null for first usage (ie, map.put will return the previous value, which for first time use is null).

I can't have it return null in this instance

Which leaves me with something like;

public V getExampleThree(K key) {
    Object object = new Object();
    V value = locks.putIfAbsent(key, object);
    if (value == null)
        return object;
    return value;
}

So, finally, my question; how do the examples above differ in semantics?. Does getExampleThree ensure atomicity like getExampleTwo but avoid the null return correctly? Are there other problems with getExampleThree?

I was hoping for a bit of discussion around the choices. I realise I could use a non ConcurrentHashMap and synchronize around clients calling my get method and a method to remove from the map but that seems to defeat the purpose (non blocking nature) of the ConcurrentHashMap. Is that my only choice to keep the data accurate?

I guess that's a bit part of why you'd choose ConcurrentHashMap; that its visible/up-to-date/acurrate at the point you interact with it, but there may be an impact further down the line if old data is going to be a problem...

like image 683
Toby Avatar asked Jan 10 '12 08:01

Toby


2 Answers

It sounds like you are trying to create a global lock object for a key.

Instead of deleting an entry with the possibility of have it re-created almost immediately, I would only delete the entry when you pretty sure its not needed ever again.

Otherwise, if you are using this for a lock, you can have two thread locking on different objects for the same key.


If its not ok, you can busy loop it.

public V getExampleOne(K key) {
    for(Object o = null, ret = null; (ret = map.get(key)) == null; )
        map.putIfAbsent(key, o == null ? o = new Object() : o);
    return ret;
}

it can still be removed or replaced as soon as the loop exists so its effectively much the same as.

public V getExampleThree(K key) {
    Object o = new Object();
    map.putIfAbsent(key, o);
    Object ret = map.get(key);
    return ret == null ? o : ret;
}

So, finally, my question; how do the examples above differ in semantics?.

The difference is only the obvious.

Does getExampleThree ensure atomicity like getExampleTwo but avoid the null return correctly?

Yes.

Are there other problems with getExampleThree?

Only if you believe the very next call might not give you a different value (if you believe it can be removed in another thread)

like image 176
Peter Lawrey Avatar answered Nov 08 '22 12:11

Peter Lawrey


The methods have different semantics:

  • getExampleOne is not atomic.
  • getExampleTwo returns null if the new object was inserted into the map. This differs from the behavior of getExampleOne, but it is atomic.
  • getExampleThree is probably what you want. It is atomic and it return the object that is in the map after the point in time of the putIfAbsent call. But it was problem when nulls are valid values in your application. The null return value is then ambiguous.

However, depending of the situation it might not be the actual object at the point in time when you use the return value. You then need explicit locking.

like image 3
dmeister Avatar answered Nov 08 '22 12:11

dmeister