I need to lazily initialize a map and its contents. This is the code I have so far:
class SomeClass {
    private Map<String, String> someMap = null;

    public String getValue(String key) {
        if (someMap == null) {
            synchronized(someMap) {
                someMap = new HashMap<String, String>();
                // initialize the map contents by loading some data from the database.
                // possible for the map to be empty after this.
            }
        }
        return someMap.get(key); // the key might not exist even after initialization
    }
}
This is obviously not thread-safe: if one thread arrives while someMap is null and assigns the new HashMap to the field, a second thread can call getValue while the first is still loading the data and miss a value that would otherwise have been there.
How can I make sure that the data is loaded into the map only once, when the first getValue call happens?
Please note that the key may not exist in the map even after initialization completes. The map may also simply be empty after initialization.
Java's HashMap is not thread-safe, so it should not be modified by multiple threads without external synchronization.
In computer programming, lazy initialization is the tactic of delaying the creation of an object, the calculation of a value, or some other expensive process until the first time it is needed. It is a kind of lazy evaluation that refers specifically to the instantiation of objects or other resources.
Maps are naturally one of the most widely used kinds of Java collection. Importantly, HashMap is not a thread-safe implementation, while Hashtable provides thread safety by synchronizing its operations.
The ConcurrentHashMap class is thread-safe, i.e. multiple threads can operate on a single instance without external synchronization. Any number of threads can read at the same time without locking the whole map, a guarantee HashMap does not give when the map is being modified concurrently.
Double-Checked Locking
Double-checked locking only works if several steps are all in place, and you are missing two of them.
First, you need to make someMap a volatile variable. This guarantees that a thread which sees the non-null reference also sees everything that was written to the map before the reference was assigned.
private volatile Map<String, String> someMap = null;
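You also need a second check for null inside the synchronized block, to make sure that another thread hasn't initialized the map while you were waiting to enter the synchronized area. The block also has to lock on an object that is never null (this in the code below), because synchronized(someMap) throws a NullPointerException while someMap is still null.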
if (someMap == null) {
    synchronized(this) {
        if (someMap == null) {
Do not assign until ready for use
When building the map, construct it in a temporary variable and assign it to someMap only at the end.
            Map<String, String> tmpMap = new HashMap<String, String>();
            // initialize the map contents by loading some data from the database.
            // possible for the map to be empty after this.
            someMap = tmpMap;
        }
    }
}
return someMap.get(key);
The temporary map is required because as soon as the line someMap = new HashMap... completes, someMap is no longer null. Other calls to getValue will then see it and never try to enter the synchronized block, so they read from the map without waiting for the database calls to complete.
Making the assignment to someMap the last step within the synchronized block prevents this from happening.
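Putting the pieces above together, a minimal self-contained sketch might look like the following. The class name LazyLookup, the loadFromDatabase helper, and the loadCount counter are hypothetical stand-ins added for illustration; the real database load is not shown in the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class LazyLookup {
    // Counts loader invocations; present only so the sketch can be checked.
    static final AtomicInteger loadCount = new AtomicInteger();

    // volatile: a thread that sees a non-null reference also sees the populated map.
    private volatile Map<String, String> someMap = null;

    public String getValue(String key) {
        if (someMap == null) {                 // first check, without the lock
            synchronized (this) {
                if (someMap == null) {         // second check, under the lock
                    Map<String, String> tmpMap = new HashMap<>();
                    loadFromDatabase(tmpMap);  // hypothetical loader
                    someMap = tmpMap;          // publish only once fully loaded
                }
            }
        }
        return someMap.get(key);               // may be null if the key is absent
    }

    // Hypothetical stand-in for loading data from the database.
    private static void loadFromDatabase(Map<String, String> target) {
        loadCount.incrementAndGet();
        target.put("a", "1");
    }
}
```

Calling getValue from several threads on the same instance should run the loader only once, and a missing key still returns null.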
unmodifiableMap
As discussed in the comments, for safety it would also be best to store the result in an unmodifiableMap, since later modifications would not be thread-safe. This is not strictly required for a private variable that is never exposed, but it is safer for the future because it stops people from coming in later and changing the code without realizing the consequences.
someMap = Collections.unmodifiableMap(tmpMap);
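To illustrate what the wrapper buys you, here is a small sketch. The class UnmodifiableDemo and its two helper methods are hypothetical names used only for this example, and the single put call is a placeholder for the database load. Note that Collections.unmodifiableMap returns a read-only view of the underlying map, so the protection holds as long as no reference to tmpMap is kept around.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class UnmodifiableDemo {
    // Builds the map in a temporary variable and returns only the read-only view.
    static Map<String, String> load() {
        Map<String, String> tmpMap = new HashMap<>();
        tmpMap.put("a", "1"); // placeholder for data loaded from the database
        return Collections.unmodifiableMap(tmpMap);
    }

    // Returns true if a write through the wrapper is rejected.
    static boolean rejectsWrites(Map<String, String> map) {
        try {
            map.put("b", "2");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }
}
```

Reads behave as before, while any later attempt to put or remove through the stored reference fails immediately instead of silently introducing a data race.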
Why not use ConcurrentMap?
ConcurrentMap makes individual actions (e.g. putIfAbsent) thread-safe, but it does not meet the fundamental requirement here: waiting until the map is fully populated with data before allowing reads from it.
Additionally, in this case the map is not modified again after the lazy initialization. A ConcurrentMap would add synchronization overhead to operations that, in this specific use case, do not need to be synchronized.
Why synchronize on this?
There is no reason. :) It was just the simplest way to present a valid answer to this question.
It would certainly be better practice to synchronize on a private internal object. You trade marginally increased memory usage and object creation time for improved encapsulation. The main risk with synchronizing on this is that it allows other programmers to access your lock object and potentially synchronize on it themselves. That causes unneeded contention between their updates and yours, so an internal lock object is safer.
In reality, though, a separate lock object is overkill in many cases. It is a judgement call that weighs the complexity of your class and how widely it is used against the simplicity of just locking on this. If in doubt, you should probably use an internal lock object and take the safest route.
In the class:
private final Object lock = new Object();
In the method:
synchronized(lock) {
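As a rough sketch of the private-lock variant, assuming the same double-checked structure as before: the class name PrivateLockLookup is hypothetical, and the single put call stands in for the database load.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class PrivateLockLookup {
    // Private lock object: code outside this class cannot synchronize on it.
    private final Object lock = new Object();
    private volatile Map<String, String> someMap = null;

    public String getValue(String key) {
        if (someMap == null) {
            synchronized (lock) {
                if (someMap == null) {
                    Map<String, String> tmpMap = new HashMap<>();
                    tmpMap.put("a", "1"); // placeholder for the database load
                    someMap = Collections.unmodifiableMap(tmpMap);
                }
            }
        }
        return someMap.get(key);
    }
}
```

With this layout, a caller that happens to hold synchronized(instance) elsewhere does not block getValue, because initialization waits only on the private lock.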
As for the java.util.concurrent.locks objects, they don't add anything useful in this case (although in other cases they are very useful). We always want to wait until the data is available, so the standard synchronized block gives us exactly the behavior we need.