Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Atomic AddOrUpdate for a C# Dictionary

Suppose the following code:

if (myDictionary.ContainsKey(aKey))
    myDictionary[aKey] = aValue;
else
    myDictionary.Add(aKey, aValue);

This code accesses the dictionary two times, once for determining whether aKey exist, another time for updating (if exist) or adding (if does not exist). I guess the performance of this method is "acceptable" when this code is executed for only few times. However, in my application similar code is executed roughly 500K times. I profiled my code, and it shows 80% of CPU time spent on this section (see following figure), so this motivates an improvement.

Note that, the dictionary is lambdas.

First workaround is simply:

myDictionary[aKey] = aValue;

If aKey exist it's value is replaced with aValue; if does not exist, a KeyValuePair with aKey as key and aValue as value is added to myDictionary. However, this method has two drawbacks:

First, you don't know if aKey exist or not that prevents you from additional logics. For instance, you can not rewrite following code based on this workaround:

int addCounter = 0, updateCounter = 0;
if (myDictionary.ContainsKey(aKey))
{
    myDictionary[aKey] = aValue;
    addCounter++;
}
else
{
    myDictionary.Add(aKey, aValue);
    updateCounter++;
}

Second, the update can not be a function of the old value. For instance, you can not do a logic similar to:

if (myDictionary.ContainsKey(aKey))    
    myDictionary[aKey] = (myDictionary[aKey] * 2) + aValue;    
else    
    myDictionary.Add(aKey, aValue);

Second workaround is to use ConcurrentDictionary. It's clear that using delegates we can solve the second aforementioned issue; however, still it is not clear to me how we can address the first issue.

Just to remind, my concern is to speed-up. Given that there is only one thread using this procedure, I don't think the penalty of concurrency (with locks) for only one thread worth using ConcurrentDictionary.

Am I missing a point ? does anyone has a better suggestion ?

like image 685
Hamed Avatar asked Nov 16 '15 10:11

Hamed


1 Answers

If you really want AddOrUpdate method like in ConcurrentDictionary but without performance implications of using one, you will have to implement such Dictionary yourself.

The good news is that since CoreCLR is open source, you can take actual .Net Dictionary source from CoreCLR repository and apply your own modification. It seems it will not be so hard, take a look at the Insert private method there.

One possible implementation would be (untested):

public void AddOrUpdate(TKey key, Func<TKey, TValue> adder, Func<TKey, TValue, TValue> updater) {

    if( key == null ) {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
    }

    if (buckets == null) Initialize(0);
    int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
    int targetBucket = hashCode % buckets.Length;

    for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next) {
        if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) {
            entries[i].value = updater(key, entries[i].value);
            version++;
            return;
        } 

    }
    int index;
    if (freeCount > 0) {
        index = freeList;
        freeList = entries[index].next;
        freeCount--;
    }
    else {
        if (count == entries.Length)
        {
            Resize();
            targetBucket = hashCode % buckets.Length;
        }
        index = count;
        count++;
    }

    entries[index].hashCode = hashCode;
    entries[index].next = buckets[targetBucket];
    entries[index].key = key;
    entries[index].value = adder(key);
    buckets[targetBucket] = index;
    version++;

}
like image 167
ghord Avatar answered Sep 28 '22 05:09

ghord