Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is a HashMap thread-safe for different keys?

If I have two multiple threads accessing a HashMap, but guarantee that they'll never be accessing the same key at the same time, could that still lead to a race condition?

like image 480
agentofuser Avatar asked Apr 22 '10 06:04

agentofuser


People also ask

Can different keys have same value in HashMap?

You should be able to make a copy of a HashMap using its clone method. Note that while this does get you two different maps, the actual values in the two maps are the same. This means that if the copied map's contents are mutable and you change them, they will still change in both places.

Can we use HashMap in a multi-threaded environment?

It is a problem if multiple threads are adding to the same HashMap instance without it being synchronized . Even if just 1 thread is modifying a HashMap and other threads are reading from that same map without synchronization, you will run into problems.

Why is HashMap thread unsafe?

"Because the hashmap will perform capacity expansion operations" is not only reason why HashMap is not thread safe. You have to refer to Java Memory Model to understand what guarantee it can offer. One of such guarantee is visibility.

Is HashMap thread-safe for read only?

As long as no thread is modifying the map (adding or removing a key/value pair, or mutating an existing value) while other threads are reading then it is thread-safe.


2 Answers

In @dotsid's answer he says this:

If you change a HashMap in any way then your code is simply broken.

He is correct. A HashMap that is updated without synchronization will break even if the threads are using disjoint sets of keys. Here are just some1 of the things that can go wrong.

  • If one thread does a put, then another thread may see a stale value for the hashmap's size.

  • If one thread does a put with a key that is (currently) in the same hash bucket as the second thread's key, second thread's map entry might get lost, temporarily or permanently. It depends on how the hash chains (or whatever) are implemented.

  • When a thread does a put that triggers a rebuild of the table, another thread may see transient or stale versions of the hashtable array reference, its size, its contents or the hash chains. Chaos may ensue.

  • When a thread does a put for a key that collides with some key used by some other thread, and the latter thread does a put for its key, then the latter might see a stale copy of hash chain reference. Chaos may ensue.

  • When one thread probes the table with a key that collides with one of some other thread's keys, it may encounter that key on the chain. It will call equals on that key, and if the threads are not synchronized, the equals method may encounter stale state in that key.

And if you have two threads simultaneously doing put or remove requests, there are numerous opportunities for race conditions.

I can think of three solutions:

  1. Use a ConcurrentHashMap.
  2. Use a regular HashMap but synchronize on the outside; e.g. using primitive mutexes, Lock objects, etcetera. But beware that this could lead to a concurrency bottleneck due to lock contention.
  3. Use a different HashMap for each thread. If the threads really have a disjoint set of keys, then there should be no need (from an algorithmic perspective) for them to share a single Map. Indeed, if your algorithms involve the threads iterating the keys, values or entries of the map at some point, splitting the single map into multiple maps could give a significant speedup for that part of the processing.

1 - We cannot enumerate all of the possible things that could go wrong. For a start, we can't predict how all JVMs will handle the unspecified aspects of the JMM ... on all platforms. But you should NOT be relying on that kind of information anyway. All you need to know is that it is fundamentally wrong to use a HashMap like this. An application that does this is broken ... even if you haven't observed the symptoms of the brokenness yet.

like image 132
Stephen C Avatar answered Sep 21 '22 21:09

Stephen C


Just use a ConcurrentHashMap. The ConcurrentHashMap uses multiple locks which cover a range of hash buckets to reduce the chances of a lock being contested. There is a marginal performance impact to acquiring an uncontested lock.

To answer your original question: According to the javadoc, as long as the structure of the map doesn't change, your are fine. This mean no removing elements at all and no adding new keys that are not already in the map. Replacing the value associated with existing keys is fine.

If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)

Though it makes no guarantees about visibility. So you have to be willing to accept retrieving stale associations occasionally.

like image 41
Tim Bender Avatar answered Sep 17 '22 21:09

Tim Bender