Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can ConcurrentDictionary.GetOrAdd() be called recursively?

ConcurrentDictionary.GetOrAdd(TKey, Func<TKey, TValue>) accepts a factory function to allow lazy instantiation of the item to be put into the dictionary.

Is it safe to define a factory function that itself calls GetOrAdd(), i.e. GetOrAdd is being called within the context of a 'parent' GetOrAdd().

The following code demonstrates the pattern; It does appear to work, but is it safe?

class Program
{
    static ConcurrentDictionary<string,object> __dict = new ConcurrentDictionary<string, object>();

    static void Main(string[] args)
    {
        Foo foo = GetOrAddFoo();
        Console.WriteLine(foo._name);
        Console.WriteLine(foo._bar._name);
        Console.ReadKey();
    }

    static Bar GetOrAddBar()
    {
        Console.WriteLine("GetOrAddBar: enter");
        Func<string,Bar> factoryFn = (x) => LoadBar(x);
        Bar bar = __dict.GetOrAdd("bar", factoryFn) as Bar;
        Console.WriteLine("GetOrAddBar: exit");
        return bar;
    }

    static Foo GetOrAddFoo()
    {
        Console.WriteLine("GetOrAddFoo: enter");
        Func<string,Foo> factoryFn = (x) => LoadFoo(x);
        Foo foo = __dict.GetOrAdd("foo", factoryFn) as Foo;
        Console.WriteLine("GetOrAddFoo: exit");
        return foo;
    }

    static Bar LoadBar(string name)
    {
        Bar bar =  new Bar();
        bar._name = name;
        return bar;
    }

    static Foo LoadFoo(string name)
    {
        Foo foo = new Foo();
        foo._name = name;
        foo._bar = GetOrAddBar();
        return foo;
    }

    public class Foo
    {
       public string _name;
       public Bar _bar;
    }

    public class Bar
    {
        public string _name;
    }
}
like image 207
redcalx Avatar asked Nov 14 '16 13:11

redcalx


2 Answers

The answer is yes, it is perfectly safe. The value functions are only called if the given key does not exist yet. Below is a walkthrough of what happens under the hood.

Walkthrough

First we assume the dictionary is completely empty, which I'll only show the keys in an array format for simplicity:

dictionary = []

On an initial first time execution of the GetOrAddFoo method, the "Foo" key does not exist so the dictionary invokes the value function, which in this invocation is the LoadFoo method. The dictionary is still empty here.

dictionary = []

LoadFoo internally calls GetOrAddBar, which checks and finds that the "Bar" key does not exist, so the LoadBar value function is invoked and it returns with the created "Bar" entry. The dictionary at this point looks like this:

dictionary = ["Bar"]

The dictionary at this point contains the item "Bar" in it. We have not yet completed the LoadFoo value function, but it will now.

dictionary = ["Bar"]

The LoadFoo method regains control and returns a Foo object to be stored in the dictionary. After LoadFoo completes, GetOrAddFoo can complete too. We now have the dictionary looking like this:

dictionary = ["Bar", "Foo"]

Future Calls of GetOrAddFoo

On subsequent calls to GetOrAddFoo, the dictionary already has an entry for Foo and thus will not invoke its value function, or even the Bar value function either. It returns immediately.

dictionary = ["Bar", "Foo"]

But what happens if we remove Bar from the dictionary and then call GetOrAddFoo? Say that we do remove it, leaving our dictionary like this:

dictionary = ["Foo"]

And now we call GetOrAddFoo again. Foo still exists, so the dictionary will not call the LoadFoo value function. And thus, Bar is not re-added to the dictionary. Our dictionary remains the same:

dictionary = ["Foo"]

If we call GetOrAddBar directly though, we'll get "Bar" added back in then.

dictionary = ["Foo", "Bar"]

Under the hood of the Concurrent Dictionary's Get-Or-Add method

Under the hood, anytime the GetOrAdd method is called, the existence of the given key is checked first.

If it does NOT exist at that point, the value function is then invoked. AFTER the value function returns, a lock is placed on the dictionary to allow the new entry to be added.

In a multi-threaded world, we could have 2 threads trying to add the same exact key to the dictionary. The dictionary would run the value function, then try to acquire the lock, and after getting a lock, check if the key exists again.

The reason for this is on lock retrieval, another thread could jump in and add the same key. The original thread would need to check for this scenario once it receives the lock, so that it doesn't end up causing a key collision.

like image 89
ajawad987 Avatar answered Oct 19 '22 20:10

ajawad987


When you decompile ConcurrentDictionary.GetOrAdd(TKey, Func) you will see line:

TryAddInternal(key, valueFactory(key), updateIfExists: false, acquireLock: true, out value);

Which means that timeline of your call in one process/thread will be:

enter __dict.GetOrAdd("foo", factoryFn)
enter LoadFoo (from valueFactory(key) in line above)
enter GetOrAddBar
enter __dict.GetOrAdd("bar", factoryFn)
enter LoadBar
leave LoadBar
enter TryAddInternal with key "bar" 
acquire lock ( Monitor.Enter(tables.m_locks[lockNo], ref lockTaken); )
add key "bar" with appropriate value
release lock ( Monitor.Exit(tables.m_locks[lockNo]); )
leave TryAddInternal with key "bar" 
leave __dict.GetOrAdd("bar", factoryFn)
leave GetOrAddBar
leave LoadFoo
enter TryAddInternal with key "foo" 
acquire lock ( Monitor.Enter(tables.m_locks[lockNo], ref lockTaken); )
add key "foo" with appropriate value
release lock ( Monitor.Exit(tables.m_locks[lockNo]); )
leave __dict.GetOrAdd("foo", factoryFn)

You can see that it will lock and release twice and in between another process can jump in and create "foo" when this process have already created "bar". If it's safe depends on you.

And when you'd call recursively with the same key, most in-depth value will "win" because there is updateIfExists: false param in TryAddInternal so any subsequent call will not change it once it's there. There is also out value param so it will return first inserted value and will not fail.

Also interesting: TryAddInternal does not lock whole Dictionary but only a bucket (part of the dicionary) based on the key. It's performance improvement.

like image 41
ElMent Avatar answered Oct 19 '22 20:10

ElMent