Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What makes Hashmap.putIfAbsent faster than containsKey followed by put?

Question

How is the HashMap method putIfAbsent able to perform a put conditionally in a way thats faster than calling containsKey(x) prior?

For example, if you didn't use putIfAbsent you could use:

 if(!map.containsKey(x)){ 
   map.put(x,someValue); 
}

I had previously thought putIfAbsent was convenience method for calling containsKey followed by a put on a HashMap. But after running a benchmark putIfAbsent is significantly faster than using containsKey followed by Put. I looked at the java.util source code to try and see how this is possible but it's a bit too cryptic for me to figure out. Does anyone know internally how putIfAbsent seems to work in a better time complexity? Thats my assumption based on running a few code tests in which my code ran 50% faster when using putIfAbsent. It seems to avoid calling a get() but how?

Example

if(!map.containsKey(x)){
     map.put(x,someValue);
}

VS

map.putIfAbsent(x,somevalue)

Java Source Code for Hashmap.putIfAbsent

@Override
public V putIfAbsent(K key, V value) {
    return putVal(hash(key), key, value, true, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
like image 314
Usman Mutawakil Avatar asked Sep 26 '18 06:09

Usman Mutawakil


People also ask

What is the time complexity of HashMap containsKey?

Generally O(1), but if we're using a bad hashCode function, we need to add multiple elements to one bucket so it can be O(n) in worst case.

What is the difference between putIfAbsent and computeIfAbsent?

Both functions aspire to add an element if the specified Key is not already present in Map. putIfAbsent adds an element with the specified Value whereas computeIfAbsent adds an element with the value computed using the Key.

How do you optimize a map in Java?

Starting from Java 8, one optimization is built-in in HashMap: When buckets are getting too large, they're transformed into trees, instead of linked lists. That brings the pessimistic time of O(n) to O(log(n)), which is much better. For that to work, the keys of HashMap need to implement the Comparable interface.

What is putIfAbsent in Java?

The putIfAbsent(Key, value) method of Hashtable class which allows to map a value to a given key if given key is not associated with a value or mapped to null. A null value is returned if such key-value set is already present in the HashMap.


2 Answers

The HashMap implementation of putIfAbsent searches for the key just once, and if it doesn't find the key, it puts the value in the relevant bin (which was already located). That's what putVal does.

On the other hand, using map.containsKey(x) followed by map.put(x,someValue) performs two lookups for the key in the Map, which takes more time.

Note that put also calls putVal (put calls putVal(hash(key), key, value, false, true) while putIfAbsent calls putVal(hash(key), key, value, true, true)), so putIfAbsent has the same performance as calling just put, which is faster than calling both containsKey and put.

like image 133
Eran Avatar answered Oct 18 '22 19:10

Eran


See Eran's answer... I'd like to also answer it more succinctly. put and putIfAbsent both use the same helper method putVal. But clients using put can't take advantage of its many parameters that allow put-if-present behavior. The public method putIfAbsent exposes this. So using putIfAbsent has the same underlying time complexity as the put you are already going to use in conjunction with containsKey. The use of containsKey then becomes a waste.

So the core of this is that private function putVal is being used by both put and putIfAbsent.

like image 30
Usman Mutawakil Avatar answered Oct 18 '22 20:10

Usman Mutawakil