Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does ConcurrentHashMap prevent null keys and values?

People also ask

Why HashTable does not allow null keys or values?

Now you must be wondering why HashTable doesn't allow null and HashMap do? The answer is simple. In order to successfully store and retrieve objects from a HashTable, the objects used as keys must implement the hashCode method and the equals method. Since null is not an object, it can't implement these methods.

Does ConcurrentHashMap allow duplicate keys?

Summary of ConcurrentHashMap features – It does not allow duplicate keys. – It does not allow null to be used as a key or value.

How does locking happen in ConcurrentHashMap?

In ConcurrentHashMap, at a time any number of threads can perform retrieval operation but for updated in the object, the thread must lock the particular segment in which the thread wants to operate. This type of locking mechanism is known as Segment locking or bucket locking.

Do we have lock while getting value from ConcurrentHashMap?

ConcurrentHashMap: It allows concurrent access to the map. Part of the map called Segment (internal data structure) is only getting locked while adding or updating the map. So ConcurrentHashMap allows concurrent threads to read the value without locking at all.


From the author of ConcurrentHashMap himself (Doug Lea):

The main reason that nulls aren't allowed in ConcurrentMaps (ConcurrentHashMaps, ConcurrentSkipListMaps) is that ambiguities that may be just barely tolerable in non-concurrent maps can't be accommodated. The main one is that if map.get(key) returns null, you can't detect whether the key explicitly maps to null vs the key isn't mapped. In a non-concurrent map, you can check this via map.contains(key), but in a concurrent one, the map might have changed between calls.


I believe it is, at least in part, to allow you to combine containsKey and get into a single call. If the map can hold nulls, there is no way to tell if get is returning a null because there was no key for that value, or just because the value was null.

Why is that a problem? Because there is no safe way to do that yourself. Take the following code:

if (m.containsKey(k)) {
   return m.get(k);
} else {
   throw new KeyNotPresentException();
}

Since m is a concurrent map, key k may be deleted between the containsKey and get calls, causing this snippet to return a null that was never in the table, rather than the desired KeyNotPresentException.

Normally you would solve that by synchronizing, but with a concurrent map that of course won't work. Hence the signature for get had to change, and the only way to do that in a backwards-compatible way was to prevent the user inserting null values in the first place, and continue using that as a placeholder for "key not found".


Josh Bloch designed HashMap; Doug Lea designed ConcurrentHashMap. I hope that isn't libelous. Actually I think the problem is that nulls often require wrapping so that the real null can stand for uninitialized. If client code requires nulls then it can pay the (admittedly small) cost of wrapping nulls itself.


You can't synchronize on a null.

Edit: This isn't exactly why in this case. I initially thought there was something fancy going on with locking things against concurrent updates or otherwise using the Object monitor to detect if something was modified, but upon examining the source code it appears I was wrong - they lock using a "segment" based on a bitmask of the hash.

In that case, I suspect they did it to copy Hashtable, and I suspect Hashtable did it because in the relational database world, null != null, so using a null as a key has no meaning.