Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dictionary.Count performance

Tags:

c#

dictionary

This question seems to be nonsense. The behaviour cannot be reproduced reliably.

Comparing the following test programs, I observed a huge performance difference between the first and the second of the following examples (the first example is by factor ten slower than the second):

First example (slow):

interface IWrappedDict {
    int Number { get; }
    void AddSomething (string k, string v);
}

class WrappedDict : IWrappedDict {
    private Dictionary<string, string> dict = new Dictionary<string,string> ();


    public void AddSomething (string k, string v) {
        dict.Add (k, v);
    }

    public int Number { get { return dict.Count; } }
}

class TestClass {
    private IWrappedDict wrappedDict;

    public TestClass (IWrappedDict theWrappedDict) {
        wrappedDict = theWrappedDict;
    }

    public void DoSomething () {
        // this function does the performance test
        for (int i = 0; i < 1000000; ++i) {
            var c = wrappedDict.Number; wrappedDict.AddSomething (...);
        }
    }
}

Second example (fast):

// IWrappedDict as above
class WrappedDict : IWrappedDict {
    private Dictionary<string, string> dict = new Dictionary<string,string> ();
    private int c = 0;

    public void AddSomething (string k, string v) {
        dict.Add (k, v); ++ c;
    }

    public int Number { get { return c; } }
}
// rest as above

Funnily, the difference vanishes (the first example gets fast as well) if I change the type of the member variable TestClass.wrappedDict from IWrappedDict to WrappedDict. My interpretation of this is that Dictionary.Count re-counts the elements every time it is accessed and that potential caching of the number of elements is done by compiler optimization only.

Can anybody confirm this? Is there any way to get the number of elements in a Dictionary in a performant way?

like image 607
JohnB Avatar asked Jan 02 '13 09:01

JohnB


2 Answers

No, Dictionary.Count does not recount the elements every time it's used. The dictionary maintains a count, and should be as fast as your second version.

I suspect that in your test of the second example, you already had WrappedDict instead of IWrappedDict, and this is actually about interface member access (which is always virtual) and the JIT compiling inlining calls to the property when it knows the concrete type.

If you still believe Count is the problem, you should be able to edit your question to show a short but complete program which demonstrates both the fast and slow versions, including how you're timing it all.

like image 100
Jon Skeet Avatar answered Oct 12 '22 03:10

Jon Skeet


Sound like your timing is off; I get:

#1: 330ms
#2: 335ms

when running the following in release mode, outside of the IDE:

public void DoSomething(int count) {
    // this function does the performance test
    for (int i = 0; i < count; ++i) {
        var c = wrappedDict.Number; wrappedDict.AddSomething(i.ToString(), "a");
    }
}
static void Execute(int count, bool show)
{
    var obj1 = new TestClass(new WrappedDict1());
    var obj2 = new TestClass(new WrappedDict2());

    GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
    GC.WaitForPendingFinalizers();
    var watch = Stopwatch.StartNew();
    obj1.DoSomething(count);
    watch.Stop();
    if(show) Console.WriteLine("#1: {0}ms", watch.ElapsedMilliseconds);

    GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
    GC.WaitForPendingFinalizers();
    watch = Stopwatch.StartNew();
    obj2.DoSomething(count);
    watch.Stop();
    if(show) Console.WriteLine("#2: {0}ms", watch.ElapsedMilliseconds);
}
static void Main()
{
    Execute(1, false); // for JIT
    Execute(1000000, true); // for measuring
}

Basically: "cannot reproduce". Also: for completeness, no: .Count does not count all the items (it already knows the count), nor does the compiler add any magic automatic caching code (note: there are a few limited examples of things like that; for example, the JIT can remove bounds-checking on a for loop over a vector).

like image 23
Marc Gravell Avatar answered Oct 12 '22 04:10

Marc Gravell