ConcurrentDictionary.GetOrAdd(TKey, Func<TKey, TValue>) accepts a factory function to allow lazy instantiation of the item to be put into the dictionary.
Is it safe to define a factory function that itself calls GetOrAdd(), i.e. GetOrAdd is being called within the context of a 'parent' GetOrAdd().
The following code demonstrates the pattern; It does appear to work, but is it safe?
class Program
{
static ConcurrentDictionary<string,object> __dict = new ConcurrentDictionary<string, object>();
static void Main(string[] args)
{
Foo foo = GetOrAddFoo();
Console.WriteLine(foo._name);
Console.WriteLine(foo._bar._name);
Console.ReadKey();
}
static Bar GetOrAddBar()
{
Console.WriteLine("GetOrAddBar: enter");
Func<string,Bar> factoryFn = (x) => LoadBar(x);
Bar bar = __dict.GetOrAdd("bar", factoryFn) as Bar;
Console.WriteLine("GetOrAddBar: exit");
return bar;
}
static Foo GetOrAddFoo()
{
Console.WriteLine("GetOrAddFoo: enter");
Func<string,Foo> factoryFn = (x) => LoadFoo(x);
Foo foo = __dict.GetOrAdd("foo", factoryFn) as Foo;
Console.WriteLine("GetOrAddFoo: exit");
return foo;
}
static Bar LoadBar(string name)
{
Bar bar = new Bar();
bar._name = name;
return bar;
}
static Foo LoadFoo(string name)
{
Foo foo = new Foo();
foo._name = name;
foo._bar = GetOrAddBar();
return foo;
}
public class Foo
{
public string _name;
public Bar _bar;
}
public class Bar
{
public string _name;
}
}
The answer is yes, it is perfectly safe. The value functions are only called if the given key does not exist yet. Below is a walkthrough of what happens under the hood.
First we assume the dictionary is completely empty, which I'll only show the keys in an array format for simplicity:
dictionary = []
On an initial first time execution of the GetOrAddFoo
method, the "Foo" key does not exist so the dictionary invokes the value function, which in this invocation is the LoadFoo
method. The dictionary is still empty here.
dictionary = []
LoadFoo
internally calls GetOrAddBar
, which checks and finds that the "Bar" key does not exist, so the LoadBar
value function is invoked and it returns with the created "Bar" entry. The dictionary at this point looks like this:
dictionary = ["Bar"]
The dictionary at this point contains the item "Bar" in it. We have not yet completed the LoadFoo
value function, but it will now.
dictionary = ["Bar"]
The LoadFoo
method regains control and returns a Foo
object to be stored in the dictionary. After LoadFoo
completes, GetOrAddFoo
can complete too. We now have the dictionary looking like this:
dictionary = ["Bar", "Foo"]
GetOrAddFoo
On subsequent calls to GetOrAddFoo
, the dictionary already has an entry for Foo
and thus will not invoke its value function, or even the Bar
value function either. It returns immediately.
dictionary = ["Bar", "Foo"]
But what happens if we remove Bar
from the dictionary and then call GetOrAddFoo
? Say that we do remove it, leaving our dictionary like this:
dictionary = ["Foo"]
And now we call GetOrAddFoo
again. Foo
still exists, so the dictionary will not call the LoadFoo
value function. And thus, Bar
is not re-added to the dictionary. Our dictionary remains the same:
dictionary = ["Foo"]
If we call GetOrAddBar
directly though, we'll get "Bar" added back in then.
dictionary = ["Foo", "Bar"]
Under the hood, anytime the GetOrAdd
method is called, the existence of the given key is checked first.
If it does NOT exist at that point, the value function is then invoked. AFTER the value function returns, a lock is placed on the dictionary to allow the new entry to be added.
In a multi-threaded world, we could have 2 threads trying to add the same exact key to the dictionary. The dictionary would run the value function, then try to acquire the lock, and after getting a lock, check if the key exists again.
The reason for this is on lock retrieval, another thread could jump in and add the same key. The original thread would need to check for this scenario once it receives the lock, so that it doesn't end up causing a key collision.
When you decompile ConcurrentDictionary.GetOrAdd(TKey, Func) you will see line:
TryAddInternal(key, valueFactory(key), updateIfExists: false, acquireLock: true, out value);
Which means that timeline of your call in one process/thread will be:
enter __dict.GetOrAdd("foo", factoryFn)
enter LoadFoo (from valueFactory(key) in line above)
enter GetOrAddBar
enter __dict.GetOrAdd("bar", factoryFn)
enter LoadBar
leave LoadBar
enter TryAddInternal with key "bar"
acquire lock ( Monitor.Enter(tables.m_locks[lockNo], ref lockTaken); )
add key "bar" with appropriate value
release lock ( Monitor.Exit(tables.m_locks[lockNo]); )
leave TryAddInternal with key "bar"
leave __dict.GetOrAdd("bar", factoryFn)
leave GetOrAddBar
leave LoadFoo
enter TryAddInternal with key "foo"
acquire lock ( Monitor.Enter(tables.m_locks[lockNo], ref lockTaken); )
add key "foo" with appropriate value
release lock ( Monitor.Exit(tables.m_locks[lockNo]); )
leave __dict.GetOrAdd("foo", factoryFn)
You can see that it will lock and release twice and in between another process can jump in and create "foo" when this process have already created "bar". If it's safe depends on you.
And when you'd call recursively with the same key, most in-depth value will "win" because there is updateIfExists: false
param in TryAddInternal so any subsequent call will not change it once it's there. There is also out value
param so it will return first inserted value and will not fail.
Also interesting: TryAddInternal
does not lock whole Dictionary but only a bucket (part of the dicionary) based on the key. It's performance improvement.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With