How to avoid double-checked locking when adding items to a Dictionary<> object in .NET?

I have a question about improving the efficiency of my program. I have a Dictionary<string, Thingey> defined to hold named Thingeys. This is a web application that will create multiple named Thingeys over time. Thingeys are somewhat expensive to create (not prohibitively so), but I’d like to avoid creating them whenever possible. My logic for getting the right Thingey for the request looks a lot like this:

    private Dictionary<string, Thingey> Thingeys;
    public Thingey GetThingey(Request request)
    {
        string thingeyName = request.ThingeyName;
        if (!this.Thingeys.ContainsKey(thingeyName))
        {
            // create a new thingey on 1st reference
            Thingey newThingey = new Thingey(request);
            lock (this.Thingeys)
            {
                if (!this.Thingeys.ContainsKey(thingeyName))
                {
                    this.Thingeys.Add(thingeyName, newThingey);
                }
                // else - oops someone else beat us to it
                // newThingey will eventually get GCed
            }
        }

        return this.Thingeys[thingeyName];
    }

In this application, Thingeys live forever once created. We don’t know how to create them or which ones will be needed until the app starts and requests begin coming in. My question is about the occasional instances in the above code where newThingey is created redundantly, because we get multiple simultaneous requests for the same Thingey before it has been created. We end up creating two of them but only adding one to our collection. Is there a better way to get Thingeys created and added that doesn’t involve check/create/lock/check/add, with the rare extraneous Thingey that we create but never use? (This code works and has been running for some time. This is just the nagging bit that has always bothered me.)

I'm trying to avoid locking the dictionary for the duration of creating a Thingey.

No Refunds No Returns asked Nov 25 '09 18:11


4 Answers

This is the standard double-checked locking problem. The way it is implemented here is unsafe and can cause various problems, up to and including a crash in the first check if the dictionary's internal state is corrupted badly enough.

It is unsafe because the first check runs without synchronization, and if you are unlucky it can run while another thread is in the middle of updating the dictionary's internal state.

A simple solution is to place the first check under the lock as well. The problem is that this becomes a global lock, and in a web environment under heavy load it can become a serious bottleneck.

In a .NET environment, there are ways to work around this issue by piggybacking on the ASP.NET synchronization mechanism.

Here is how I did it in the NDjango rendering engine: I keep one global dictionary and one dictionary per rendering thread. When a request comes in, I check the local dictionary first. This check does not have to be synchronized, and if the thingy is there I just take it.

If it is not, I synchronize on the global dictionary and check whether it is there. If it is, I add it to my thread dictionary and release the lock. If it is not in the global dictionary, I add it there first while still under the lock.
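The two-level lookup described above might be sketched roughly as follows. This is not NDjango's actual code: ThingeyCache, the simplified Thingey stand-in, and the use of ThreadLocal<T> (available from .NET 4) for the per-thread dictionary are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the Thingey type from the question.
public class Thingey
{
    public string Name { get; private set; }
    public Thingey(string name) { Name = name; }
}

public class ThingeyCache
{
    // Shared by all threads; every access goes through globalLock.
    private readonly object globalLock = new object();
    private readonly Dictionary<string, Thingey> global =
        new Dictionary<string, Thingey>();

    // One private dictionary per thread. Only the owning thread touches it,
    // so lookups in it need no synchronization.
    private readonly ThreadLocal<Dictionary<string, Thingey>> local =
        new ThreadLocal<Dictionary<string, Thingey>>(
            () => new Dictionary<string, Thingey>());

    public Thingey GetThingey(string name)
    {
        Dictionary<string, Thingey> mine = local.Value;

        Thingey thingey;
        if (mine.TryGetValue(name, out thingey))
        {
            return thingey; // fast path: no lock taken
        }

        lock (globalLock)
        {
            // Create the Thingey only if no other thread has done so yet.
            if (!global.TryGetValue(name, out thingey))
            {
                thingey = new Thingey(name);
                global[name] = thingey;
            }
        }

        // Remember it locally so the next lookup on this thread skips the lock.
        mine[name] = thingey;
        return thingey;
    }
}
```

In this sketch each thread pays for the global lock at most once per key, at the cost of one extra dictionary per thread. The Thingey is still constructed while the global lock is held.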

mfeingold answered Oct 29 '22 00:10


Well, from my point of view simpler code is better, so I'd only use one lock:

private readonly object thingeysLock = new object();
private readonly Dictionary<string, Thingey> thingeys = new Dictionary<string, Thingey>();

public Thingey GetThingey(Request request)
{
    string key = request.ThingeyName;
    lock (thingeysLock)
    {
        Thingey ret;
        if (!thingeys.TryGetValue(key, out ret))
        {
            ret = new Thingey(request);
            thingeys[key] = ret;
        }
        return ret;
    }
}

Locks are really cheap when they're not contended. The downside is that this means that occasionally you will block everyone for the whole duration of the time you're creating a new Thingey. Clearly to avoid creating redundant thingeys you'd have to at least block while multiple threads create the Thingey for the same key. Reducing it so that they only block in that situation is somewhat harder.

I would suggest you use the above code but profile it to see whether it's fast enough. If you really need "only block when another thread is already creating the same thingey" then let us know and we'll see what we can do...

EDIT: You've commented on Adam's answer that you "don't want to lock while a new Thingey is being created" - you do realise that there's no getting away from that if there's contention for the same key, right? If thread 1 starts creating a Thingey, then thread 2 asks for the same key, your alternatives for thread 2 are either waiting or creating another instance.
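The "creating another instance" alternative can be sketched with ConcurrentDictionary<TKey, TValue>.GetOrAdd from System.Collections.Concurrent. That type arrived in .NET 4, so it is a swapped-in technique rather than anything from the original code. GetOrAdd does not run the value factory under a lock, so two threads racing on the same key may each construct a Thingey, but only one is stored and every caller receives the stored instance. ThingeyRegistry and the simplified Thingey below are hypothetical stand-ins.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for the Thingey type from the question.
public class Thingey
{
    public string Name { get; private set; }
    public Thingey(string name) { Name = name; }
}

public class ThingeyRegistry
{
    private readonly ConcurrentDictionary<string, Thingey> thingeys =
        new ConcurrentDictionary<string, Thingey>();

    public Thingey GetThingey(string name)
    {
        // No explicit lock: the dictionary handles its own synchronization.
        // Under contention the factory may run more than once for a key,
        // and the losing instance is simply discarded.
        return thingeys.GetOrAdd(name, n => new Thingey(n));
    }
}
```

This removes the unsafe unsynchronized read and the hand-written lock, but it keeps the occasional redundant creation that the question asks about.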

EDIT: Okay, this is generally interesting, so here's a first pass at the "only block other threads asking for the same item".

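// Requires: using System.Collections.Generic; using System.Threading;
private readonly object dictionaryLock = new object();
private readonly object creationLocksLock = new object();
private readonly Dictionary<string, Thingey> thingeys = new Dictionary<string, Thingey>();
private readonly Dictionary<string, object> creationLocks = new Dictionary<string, object>();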

public Thingey GetThingey(Request request)
{
    string key = request.ThingeyName;
    Thingey ret;
    bool entryExists;
    lock (dictionaryLock)
    {
       entryExists = thingeys.TryGetValue(key, out ret);
       // Atomically mark the dictionary to say we're creating this item,
       // and also set an entry for others to lock on
       if (!entryExists)
       {
           thingeys[key] = null;
           lock (creationLocksLock)
           {
               creationLocks[key] = new object();          
           }
       }
    }
    // If we found something, great!
    if (ret != null)
    {
        return ret;
    }
    // Otherwise, see if we're going to create it or whether we need to wait.
    if (entryExists)
    {
        object creationLock;
        lock (creationLocksLock)
        {
            creationLocks.TryGetValue(key, out creationLock);
        }
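        // If creationLock is null, it means the creating thread has finished
        // creating it and removed the creation lock, so we don't need to wait.
        if (creationLock != null)
        {
            lock (creationLock)
            {
                // Re-check under the lock: if the creating thread has already
                // removed the entry, its pulse has come and gone, and waiting
                // now would block forever.
                bool stillCreating;
                lock (creationLocksLock)
                {
                    stillCreating = creationLocks.ContainsKey(key);
                }
                if (stillCreating)
                {
                    Monitor.Wait(creationLock);
                }
            }
        }
        // We *know* it's in the dictionary now - so just return it.
        lock (dictionaryLock)
        {
            return thingeys[key];
        }
    }
    else // We said we'd create it
    {
        Thingey thingey = new Thingey(request);
        // Put it in the dictionary
        lock (dictionaryLock)
        {
            thingeys[key] = thingey;
        }
        // Tell anyone waiting that they can look now. Monitor.PulseAll
        // must be called while holding the lock on the pulsed object.
        object creationLock;
        lock (creationLocksLock)
        {
            creationLock = creationLocks[key];
            creationLocks.Remove(key);
        }
        lock (creationLock)
        {
            Monitor.PulseAll(creationLock);
        }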
        return thingey;
    }
}

Phew!

That's completely untested, and in particular it isn't in any way, shape or form robust in the face of exceptions in the creating thread... but I think it's the generally right idea :)

Jon Skeet answered Oct 28 '22 22:10


If you're looking to avoid blocking unrelated threads, then additional work is needed (and it should only be necessary if you've profiled and found that performance is unacceptable with the simpler code). I would recommend using a lightweight wrapper class that creates the Thingey lazily, outside the dictionary lock, and storing that wrapper in your dictionary.

private readonly Dictionary<string, ThingeyWrapper> Thingeys = new Dictionary<string, ThingeyWrapper>();

private class ThingeyWrapper
{
    public Thingey Thing { get; private set; }

    // volatile so that a thread which sees null here also sees the finished Thing
    private volatile object creationFlag;
    private Request request;

    public ThingeyWrapper(Request request)
    {
        creationFlag = new object();
        this.request = request;
    }

    public void WaitForCreation()
    {
        object flag = creationFlag;

        if(flag != null)
        {
            lock(flag)
            {
                if(request != null) Thing = new Thingey(request);

                creationFlag = null;

                request = null;
            }
        }
    }
}

public Thingey GetThingey(Request request)
{
    string thingeyName = request.ThingeyName;

    ThingeyWrapper output;

    lock (this.Thingeys)
    {
        if(!this.Thingeys.TryGetValue(thingeyName, out output))
        {
            output = new ThingeyWrapper(request);

            this.Thingeys.Add(thingeyName, output);
        }
    }

    output.WaitForCreation();

    return output.Thing;
}

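While you are still locking the dictionary on all calls, that lock is now held only long enough to look up or insert a cheap wrapper. The expensive Thingey creation happens under the wrapper's own lock, so it only blocks threads asking for the same Thingey.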
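On .NET 4 and later, the wrapper idea above can be approximated with framework types: Lazy<T> plays the role of ThingeyWrapper and ConcurrentDictionary replaces the locked Dictionary. This is a sketch of a swapped-in technique, not the more robust solution mentioned below. LazyThingeyCache, the simplified Thingey, and its Created counter are hypothetical names added for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for the Thingey type from the question.
public class Thingey
{
    // Counts constructor runs; for illustration only.
    public static int Created;

    public string Name { get; private set; }

    public Thingey(string name)
    {
        Interlocked.Increment(ref Created);
        Name = name;
    }
}

public class LazyThingeyCache
{
    private readonly ConcurrentDictionary<string, Lazy<Thingey>> thingeys =
        new ConcurrentDictionary<string, Lazy<Thingey>>();

    public Thingey GetThingey(string name)
    {
        // GetOrAdd may build more than one Lazy wrapper under contention,
        // but wrappers are cheap and only the stored one is ever used.
        Lazy<Thingey> wrapper = thingeys.GetOrAdd(
            name,
            n => new Lazy<Thingey>(
                () => new Thingey(n),
                LazyThreadSafetyMode.ExecutionAndPublication));

        // The first caller runs the factory; callers asking for the same
        // key block here until it finishes. Other keys are unaffected.
        return wrapper.Value;
    }
}
```

With ExecutionAndPublication, the stored wrapper runs its factory once, so concurrent requests for the same name wait for a single Thingey rather than each creating their own.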

Edit

This issue has stuck with me more than I expected it to, so I whipped together a somewhat more robust solution that follows this general pattern. You can find it here.

Adam Robinson answered Oct 29 '22 00:10


IMHO, if this piece of code is called from many threads simultaneously, it is recommended to check it twice.

(But: I'm not sure that you can safely call ContainsKey while some other thread is calling Add. So it might not be possible to avoid the lock at all.)

If you just want to avoid a Thingey being created but not used, just create it within the locking block:

private Dictionary<string, Thingey> Thingeys;
public Thingey GetThingey(Request request)
{
    string thingeyName = request.ThingeyName;
    if (!this.Thingeys.ContainsKey(thingeyName))
    {
        lock (this.Thingeys)
        {
            // only one thread at a time can create a Thingey
            if (!this.Thingeys.ContainsKey(thingeyName))
            {
                // create it only after the second check, so it is never discarded
                Thingey newThingey = new Thingey(request);
                this.Thingeys.Add(thingeyName, newThingey);
            }
        }
    }

    return this.Thingeys[thingeyName];
}

Stefan Steinegger answered Oct 28 '22 22:10