I am in the middle of developing a custom persistent key-value data structure to compare against SQLite and Berkeley DB. Before writing the implementation, I wanted to find the best data structure for this purpose. I looked at a couple:
I wanted the data structures I picked to have performance numbers comparable to the .NET Dictionary.
I used a simple test: a for loop with 500k iterations for inserts, with a Stopwatch measuring insert and key lookup times:
I noticed that:
Insert time: 7% slower than the .NET Dictionary.
Lookup time: 1000% slower than the .NET Dictionary. This is even slower than the lookup speed with SQLite! I repeated the test with compiler optimizations turned on and still got similar results.
I realize I am comparing hash tables against trees and so on, but I am stumped by the performance discrepancy between the data structures.
Does anybody have any ideas?
Two thoughts:
You should make sure you are not inadvertently including JIT time in your tests - this can add a considerable amount of time to the result. You should perform two runs in the same execution and discard the first run.
You should make sure that you are not running under the debugger - this can dramatically skew performance results.
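A minimal sketch of both points, assuming a plain console program built in Release mode. The names `LookupBenchmark` and `MeasureWarm` are made up for illustration and are not from the original test. It runs the workload twice, keeps only the second timing, and warns if a debugger is attached:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LookupBenchmark
{
    // Runs `action` twice and returns only the second measurement, so that
    // one-time JIT compilation cost falls into the discarded first run.
    public static TimeSpan MeasureWarm(Action action)
    {
        action();                          // warm-up run, timing discarded
        var sw = Stopwatch.StartNew();
        action();                          // measured run
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int N = 500_000;             // iteration count from the question
        var dict = new Dictionary<int, int>(N);
        for (int i = 0; i < N; i++) dict[i] = i;

        long checksum = 0;                 // consumed so the loop is not dead code
        TimeSpan elapsed = MeasureWarm(() =>
        {
            for (int i = 0; i < N; i++) checksum += dict[i];
        });

        if (Debugger.IsAttached)
            Console.WriteLine("Warning: debugger attached, timings are not representative.");
        Console.WriteLine($"Lookups: {elapsed.TotalMilliseconds:F1} ms (checksum {checksum})");
    }
}
```

The same `MeasureWarm` helper can wrap the insert loop or any other container under test, so every structure is timed under the same conditions.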
Aside from that, any performance differences you see may very well be the result of the difference in performance between a hash table and a tree. A lookup in a balanced tree is O(log(n)); an unbalanced tree can degrade toward O(n) in the worst case. Hashtables, meanwhile, can approach O(1) time for lookups when hash collisions are avoided.
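To see the logarithmic cost of a tree lookup directly, one option is to count key comparisons instead of timing them. The sketch below uses .NET's `SortedDictionary<TKey, TValue>`, which is documented as a binary search tree with O(log n) retrieval, as a stand-in for whatever tree was tested in the question. `CountingComparer` and `TreeDepthDemo` are hypothetical helper names:

```csharp
using System;
using System.Collections.Generic;

// Comparer that counts how many key comparisons the container performs.
public sealed class CountingComparer : IComparer<int>
{
    public long Count;
    public int Compare(int x, int y) { Count++; return x.CompareTo(y); }
}

public static class TreeDepthDemo
{
    // Returns the number of key comparisons a single lookup performs
    // in a tree holding the keys 0..n-1.
    public static long ComparisonsForLookup(int n, int key)
    {
        var comparer = new CountingComparer();
        var tree = new SortedDictionary<int, int>(comparer);
        for (int i = 0; i < n; i++) tree[i] = i;

        comparer.Count = 0;                // ignore comparisons made during inserts
        tree.TryGetValue(key, out _);
        return comparer.Count;
    }

    public static void Main()
    {
        foreach (int n in new[] { 1_000, 500_000 })
        {
            long c = ComparisonsForLookup(n, n / 2);
            Console.WriteLine($"n={n}: {c} comparisons, log2(n)={Math.Log(n, 2):F1}");
        }
    }
}
```

Each of those comparisons usually also means following a pointer to another node, whereas a hash table lookup without collisions computes one hash and probes one bucket. That gap grows with n, which is consistent with a tree losing clearly to Dictionary at 500k entries.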
I would also imagine that the .NET Dictionary class is highly optimized since it is a bread-and-butter data structure for so many different things in .NET. Also, a generic Dictionary<> may be able to avoid boxing, and therefore you may see some performance differences from that.
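The boxing point can be checked with a rough comparison like the one below. This is a sketch under the assumption that keys and values are `int`; `BoxingDemo` and its methods are illustrative names. The non-generic `Hashtable` stores everything as `object`, so each `int` key is boxed on the way in and each value is unboxed on the way out, while `Dictionary<int, int>` works on the values directly:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

public static class BoxingDemo
{
    // Non-generic Hashtable: the int key is boxed for the lookup and the
    // stored value must be unboxed with a cast.
    public static long SumHashtable(Hashtable table, int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += (int)table[i];
        return sum;
    }

    // Generic Dictionary<int, int>: no boxing, no cast.
    public static long SumDictionary(Dictionary<int, int> dict, int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += dict[i];
        return sum;
    }

    public static void Main()
    {
        const int N = 500_000;
        var table = new Hashtable(N);
        var dict = new Dictionary<int, int>(N);
        for (int i = 0; i < N; i++) { table[i] = i; dict[i] = i; }

        // Warm-up pass so JIT time is not included in either measurement.
        SumHashtable(table, N);
        SumDictionary(dict, N);

        var sw = Stopwatch.StartNew();
        long a = SumHashtable(table, N);
        TimeSpan boxed = sw.Elapsed;

        sw.Restart();
        long b = SumDictionary(dict, N);
        TimeSpan generic = sw.Elapsed;

        Console.WriteLine($"Hashtable:  {boxed.TotalMilliseconds:F1} ms (sum {a})");
        Console.WriteLine($"Dictionary: {generic.TotalMilliseconds:F1} ms (sum {b})");
    }
}
```

If the custom structures under test store keys or values as `object`, they pay the same boxing cost, which may account for part of the gap.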