
Optimizing Lookups: Dictionary key lookups vs. Array index lookups

I'm writing a 7 card poker hand evaluator as one of my pet projects. While trying to optimize its speed (I like the challenge), I was shocked to find that the performance of Dictionary key lookups was quite slow compared to array index lookups.

For example, I ran this sample code, which enumerates all 52 choose 7 = 133,784,560 possible 7-card hands and performs one lookup per hand:

    var intDict = new Dictionary<int, int>();
    var intList = new List<int>();
    for (int i = 0; i < 100000; i++)
    {
        intDict.Add(i, i);
        intList.Add(i);
    }

    int result;

    var sw = new Stopwatch();
    sw.Start();
    for (int card1 = 0; card1 < 46; card1++)
      for (int card2 = card1 + 1; card2 < 47; card2++)
        for (int card3 = card2 + 1; card3 < 48; card3++)
          for (int card4 = card3 + 1; card4 < 49; card4++)
            for (int card5 = card4 + 1; card5 < 50; card5++)
              for (int card6 = card5 + 1; card6 < 51; card6++)
                for (int card7 = card6 + 1; card7 < 52; card7++)
                  result = intDict[32131]; // perform C(52,7) dictionary key lookups
    sw.Stop();
    Console.WriteLine("time for dictionary lookups: {0} ms", sw.ElapsedMilliseconds);

    sw.Reset();

    sw.Start();
    for (int card1 = 0; card1 < 46; card1++)
      for (int card2 = card1 + 1; card2 < 47; card2++)
        for (int card3 = card2 + 1; card3 < 48; card3++)
          for (int card4 = card3 + 1; card4 < 49; card4++)
            for (int card5 = card4 + 1; card5 < 50; card5++)
              for (int card6 = card5 + 1; card6 < 51; card6++)
                for (int card7 = card6 + 1; card7 < 52; card7++)
                  result = intList[32131]; // perform C(52,7) array index lookups
    sw.Stop();
    Console.WriteLine("time for array index lookups: {0} ms", sw.ElapsedMilliseconds);

which outputs:

    time for dictionary lookups: 2532 ms
    time for array index lookups: 313 ms

Is this type of behavior expected (performance decrease by a factor of 8)? IIRC, a Dictionary has, on average, O(1) lookups, while an array has worst-case O(1) lookups, so I do expect the array lookups to be faster, but not by this much!

I am currently storing poker hand rankings in a Dictionary. I suppose if this is as fast as the dictionary lookups can be, I have to rethink my approach and use arrays instead, although indexing the rankings will get a little tricky and I'll probably have to ask another question about it.

snazzer asked May 25 '09 at 21:05



1 Answer

Don't forget that Big-O notation only says how the complexity grows with respect to the size (etc) - it doesn't give any indication of the constant factors involved. That's why sometimes even a linear search for keys is faster than a dictionary lookup, when there are sufficiently few keys. In this case you're not even doing a search with the array though - just a straight indexing operation.

For straight index lookups, arrays are basically ideal - it's just a case of

pointer_into_array = base_pointer + offset * size 

(And then a pointer dereference.)
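To make that arithmetic concrete, here is a minimal sketch that reads an element of an int[] by computing its byte offset by hand. The class and method names (IndexingSketch, ReadByOffset) are made up for illustration, and this is not how the runtime actually performs indexing (the JIT emits the address calculation directly, plus a bounds check); it just shows that index * element size lands on the same value.

```csharp
using System;

// Hypothetical helper for illustration only.
static class IndexingSketch
{
    // Reads element `index` of `source` via an explicit byte offset,
    // mirroring base_pointer + offset * size from the formula above.
    public static int ReadByOffset(int[] source, int index)
    {
        // Copy the array's raw bytes (native byte order) into a byte buffer.
        var bytes = new byte[source.Length * sizeof(int)];
        Buffer.BlockCopy(source, 0, bytes, 0, bytes.Length);

        int byteOffset = index * sizeof(int);            // offset * size
        return BitConverter.ToInt32(bytes, byteOffset);  // the "dereference"
    }
}
```

For any valid index i, IndexingSketch.ReadByOffset(values, i) should return the same number as values[i].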

Performing a dictionary lookup is relatively complicated - very fast compared with (say) a linear lookup by key when there are lots of keys, but much more complicated than a straight array lookup. It has to calculate the hash of the key, then work out which bucket that should be in, possibly deal with duplicate hashes (or duplicate buckets) and then check for equality.
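Those steps can be sketched with a toy chained hash map. To be clear, TinyIntMap is a hypothetical teaching class, not the actual implementation of Dictionary<TKey, TValue>; it only illustrates the hash, bucket, collision and equality steps listed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical teaching class; NOT how Dictionary<TKey, TValue> is implemented.
sealed class TinyIntMap
{
    private readonly List<KeyValuePair<int, int>>[] buckets;

    public TinyIntMap(int bucketCount)
    {
        buckets = new List<KeyValuePair<int, int>>[bucketCount];
        for (int i = 0; i < bucketCount; i++)
            buckets[i] = new List<KeyValuePair<int, int>>();
    }

    private int BucketFor(int key)
    {
        int hash = key.GetHashCode();                 // 1. hash the key
        return (hash & 0x7FFFFFFF) % buckets.Length;  // 2. map the hash to a bucket
    }

    // For brevity this does not reject duplicate keys.
    public void Add(int key, int value)
    {
        buckets[BucketFor(key)].Add(new KeyValuePair<int, int>(key, value));
    }

    public bool TryGetValue(int key, out int value)
    {
        // 3. walk every entry that landed in the same bucket
        foreach (var entry in buckets[BucketFor(key)])
        {
            if (entry.Key == key)                     // 4. confirm with an equality check
            {
                value = entry.Value;
                return true;
            }
        }
        value = 0;
        return false;
    }
}
```

Even in this stripped-down form, a lookup involves a method call, a modulo, a loop and a comparison, which is a lot more work than one address calculation.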

As always, choose the right data structure for the job - and if you really can get away with just indexing into an array (or List<T>) then yes, that will be blindingly fast.
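If your keys happen to be small, non-negative and fairly dense integers, one option is to copy the dictionary into a flat array once and index into that in the hot loop. The following is only a sketch under that assumption; DenseLookup, ToArray and the `missing` placeholder are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; assumes every key lies in [0, size).
static class DenseLookup
{
    // Copies the dictionary into an array indexed by key.
    // Slots with no corresponding key are filled with `missing`.
    public static int[] ToArray(Dictionary<int, int> source, int size, int missing)
    {
        var table = new int[size];
        for (int i = 0; i < size; i++)
            table[i] = missing;

        foreach (var pair in source)
        {
            if (pair.Key < 0 || pair.Key >= size)
                throw new ArgumentOutOfRangeException(nameof(source), "key outside [0, size)");
            table[pair.Key] = pair.Value;
        }
        return table;
    }
}
```

With the intDict from the question, that would be something like `int[] table = DenseLookup.ToArray(intDict, 100000, -1);` followed by `result = table[32131];` inside the loops. If the keys are sparse or very large, the array wastes memory and the dictionary is probably the better fit.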

Jon Skeet answered Sep 19 '22 at 20:09