
Dictionary access speed comparison with integer key against string key

I have a large dictionary in which I look up values many times. My keys are integers, but they represent labels, so they never need to be added, subtracted, etc. I ended up comparing access times for a string-keyed dictionary and an integer-keyed dictionary, and here is the result.

from timeit import Timer

Dint = dict()
Dstr = dict()

for i in range(10000):
    Dint[i] = i
    Dstr[str(i)] = i

print 'string key in Dint',
print(Timer("'7498' in Dint", "from __main__ import Dint").timeit(100000000))
print 'int key in Dint',
print(Timer("7498 in Dint", "from __main__ import Dint").timeit(100000000))
print 'string key in Dstr',
print(Timer("'7498' in Dstr", "from __main__ import Dstr").timeit(100000000))
print 'int key in Dstr',
print(Timer("7498 in Dstr", "from __main__ import Dstr").timeit(100000000))

which produces the following, with slight variations between runs but the same pattern each time:

string key in Dint 4.5552944017
int key in Dint 7.14334390267
string key in Dstr 6.69923791116
int key in Dstr 5.03503126455

Does this prove that a dictionary with string keys is faster to access than one with integer keys?

fallino asked Dec 06 '11 16:12




2 Answers

CPython's dict implementation is in fact optimized for string key lookups. There are two different lookup functions, lookdict and lookdict_string (lookdict_unicode in Python 3). Python uses the string-optimized version until the dict is searched for a non-string key, after which it uses the more general function. You can look at the actual implementation by downloading CPython's source and reading through dictobject.c.

As a result of this optimization, lookups are faster when a dict has all string keys.

zeekay answered Sep 27 '22 17:09



I'm afraid your times don't really prove very much.

Your test for a string in Dint is the fastest. In general, a test for anything that is not in a dictionary is quite likely to be fast, but here that is only because you were lucky and hit an empty cell on the first probe, so the lookup could terminate immediately. If you were unlucky and chose a value that hit one or more full cells, it could end up slower than the cases that actually find something.
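One way to take luck out of the picture is to time many different keys instead of a single one. The following is only a rough sketch in Python 3 (the question's code is Python 2); the names d_int, present, absent and time_lookups are made up for illustration, and the numbers it prints will vary by machine and Python version.

```python
from timeit import timeit

# Same shape as Dint in the question: integer keys 0..9999.
d_int = {i: i for i in range(10000)}

# Many keys that exist and many that do not, so that no single
# lucky or unlucky probe sequence dominates the measurement.
present = list(range(0, 10000, 10))
absent = list(range(10000, 20000, 10))


def time_lookups(d, keys, repeat=50):
    """Return the total seconds spent testing every key in `keys` against `d`."""
    return timeit(lambda: [k in d for k in keys], number=repeat)


print("present keys:", time_lookups(d_int, present))
print("absent keys: ", time_lookups(d_int, absent))
```

Averaging over a thousand keys per group gives a fairer picture of hits versus misses than repeating one membership test a hundred million times.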

Testing for an arbitrary string in a dictionary has to calculate the hash code for the string. That takes time proportional to the length of the string, but Python has a neat trick: it calculates the hash only once for each string object and caches it. Since you use the same string over and over in your timing test, the time taken to calculate the hash is lost, as it only happens the first time and not the other 99999999 times. If you were using a different string each time you would get a very different result.
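The caching is easy to see for yourself with a deliberately long string. This is a small sketch in Python 3 that relies on CPython behaviour; fresh_string is a hypothetical helper, and the timings are whatever your machine gives you.

```python
from timeit import timeit


def fresh_string(n=1_000_000):
    # Built at runtime, so this is a new string object whose hash
    # has not been computed yet.
    return "x" * n


s = fresh_string()

# The first call has to walk the whole string to compute the hash.
first = timeit(lambda: hash(s), number=1)
# The second call on the same object can reuse the stored value.
second = timeit(lambda: hash(s), number=1)

print("first hash: ", first)
print("second hash:", second)
```

The first call should normally take noticeably longer than the second. A benchmark that reuses one string literal only ever pays that first cost once.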

Python has optimised code for dictionaries where the keys are strings. Overall you should find that using string keys where you use the same keys multiple times is slightly faster, but if you have to keep converting integers to strings before the lookup you'll lose that advantage.
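To check the conversion cost on your own data, something like the following could be used. It is a sketch in Python 3 under the assumption that your labels start out as integers; d_int, d_str, keys and the two lookup functions are illustrative names, not anything from the question.

```python
from timeit import timeit

d_int = {i: i for i in range(10000)}
d_str = {str(i): i for i in range(10000)}

# The labels arrive as integers, as in the question.
keys = list(range(0, 10000, 7))


def lookup_int():
    # Use the integer label directly as the key.
    return sum(d_int[k] for k in keys)


def lookup_str_with_conversion():
    # Convert each integer label to a string before every lookup.
    return sum(d_str[str(k)] for k in keys)


t_int = timeit(lookup_int, number=100)
t_str = timeit(lookup_str_with_conversion, number=100)

print("int keys:              ", t_int)
print("str keys + conversion: ", t_str)
```

Both functions return the same sum, so any difference in the timings comes from the key type and the str() call rather than from the work done with the values.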

Duncan answered Sep 27 '22 16:09
