Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the point of this C# Dictionary<,> optimization?

Tags:

c#

I'm using C# 4.0. "Optimize code" in Visual Studio is turned on.

Consider the following code, in a class:

Dictionary<int, int> dictionary = new Dictionary<int, int>();

public void IncrementDictionary(int key) {
    if (!dictionary.ContainsKey(key)) {
        dictionary[key] = 1;
    } else {
        dictionary[key]++;
    }
}

Here, a call to IncrementDictionary does one of two things:

  • If no value exists for key, then a value is created and initialized to 1.
  • If a value exists, the value is incremented by 1.

Now watch what happens when I use ILSpy to decompile the result:

Dictionary<int, int> dictionary = new Dictionary<int, int>();

public void IncrementDictionary(int key) {
    if (!dictionary.ContainsKey(key)) {
        dictionary[key] = 1;
        return;
    }
    Dictionary<int, int> dictionary2;
    (dictionary2 = dictionary)[key] = dictionary2[key] + 1;
}

Note: In the actual production code using this, the optimizer/compiler also creates: int key2 = key; and uses key2 in the final line.

Ok, var has been replaced by Dictionary<int, int>, which is expected. And the if statement was simplified to add a return instead of using an else.

But why the heck was a new reference to the original dictionary created?

like image 418
Chris Laplante Avatar asked Jun 27 '12 22:06

Chris Laplante


1 Answers

I'm guessing it could be to avoid a race condition where if you have:

dictionary[i] = dictionary[i] + 1

It's not atomic. The dictionary being assigned to could change after you got the value and incremented.

Imagine this code:

public Dictionary<int, int> dictionary = new Dictionary<int, int>();

public void Increment()
{
    int newValue = dictionary[0] + 1;
    //meanwhile, right now in another thread: dictionary = new Dictionary<int, int>();
    dictionary[0] = newValue; //at this point, "dictionary" is actually pointing to a whole new instance
}

With the local variable assignment they have, it looks more like this to avoid the condition:

public void IncrementFix()
{
    var dictionary2 = dictionary;
    //in another thread: dictionary = new Dictionary<int, int>();
    //this is OK, dictionary2 is still pointing to the ORIGINAL dictionary
    int newValue = dictionary2[0] + 1;
    dictionary2[0] = newValue;
}

Note that it doesn't fully satisfy all thread-safe requirements. For example, in this case, we start incrementing the value, but the dictionary reference on the class has changed to a whole new instance. But if you need that higher level of thread safety, then you need to implement your own aggressive synchronization/locking, which is generally outside the scope of compiler optimizations. However this one, from what I can tell, doesn't really add any big hit (if any) to processing and avoids this condition. This might especially be the case if dictionary is a property not a field as it is in your example, in which case it would definitely be an optimization to not resolve the property getter twice. (By any chance, is your actual code using a property for the dictionary rather than the field you posted?)

EDIT: Well, for a simple method:

public void IncrementDictionary() 
{
    dictionary[0]++;
}

The IL reported from LINQPad is:

IL_0000:  nop         
IL_0001:  ldarg.0     
IL_0002:  ldfld       UserQuery.dictionary
IL_0007:  dup         
IL_0008:  stloc.0     
IL_0009:  ldc.i4.0    
IL_000A:  ldloc.0     
IL_000B:  ldc.i4.0    
IL_000C:  callvirt    System.Collections.Generic.Dictionary<System.Int32,System.Int32>.get_Item
IL_0011:  ldc.i4.1    
IL_0012:  add         
IL_0013:  callvirt    System.Collections.Generic.Dictionary<System.Int32,System.Int32>.set_Item
IL_0018:  nop         
IL_0019:  ret         

I'm not entirely sure (I'm not an IL wiz), but I think the dup call essentially doubles up the same dictionary reference on the stack so regardless both the get and set calls will point to the same dictionary. Perhaps this is just how ILSpy represents it as C# code (it's more or less the same at least as behaviour goes). I think. Please correct me if I'm wrong because, like I said, I don't know IL like the back of my hand yet.

EDIT: Gotta run, but the final gist of it is: ++ and += are not an atomic operations and is actually a good deal more complicated in executed instructions than as depicted in C#. As such, to make sure that that each of the get/increment/set steps are performed on the same dictionary instance (as you would expect and require from the C# code) a local reference to the dictionary is made to avoid running the field "get" operation twice which could result to pointing to a new instance. How ILSpy depicts that all that work involved with a indexed += operation is up to it.

like image 78
Chris Sinclair Avatar answered Nov 08 '22 03:11

Chris Sinclair