Is there a performance difference in using a tuple over a frozenset as a key for a dictionary?

Tags:

python

I have a script that makes many calls to a dictionary using a key consisting of two variables. I know that my program will encounter the two variables again in the reverse order which makes storing the key as a tuple feasible. (Creating a matrix with the same labels for rows and columns)

Therefore, I was wondering if there was a performance difference in using a tuple over a frozenset for a dictionary key.

912

asked May 01 '12 13:05

chrtan

2 Answers

In a quick test, apparently it makes a negligible difference.

python -m timeit -s "keys = list(zip(range(10000), range(10, 10000)))" -s "values = range(10000)" -s "a=dict(zip(keys, values))" "for i in keys:" "  _ = a[i]"
1000 loops, best of 3: 855 usec per loop

python -m timeit -s "keys = [frozenset(i) for i in zip(range(10000), range(10, 10000))]" -s "values = range(10000)" -s "a=dict(zip(keys, values))" "for i in keys:" "  _ = a[i]"
1000 loops, best of 3: 848 usec per loop

I really would just go with what is best elsewhere in your code.

answered Sep 30 '22 15:09

Gareth Latty

Without having done any tests, I have a few guesses. For frozensets, cpython stores the hash after it has been calculated; furthermore, iterating over a set of any kind incurs extra overhead because the data is stored sparsely. In a 2-item set, that imposes a significant performance penalty on the first hash, but would probably make the second hash very fast -- at least when the object itself is the same. (i.e. is not a new but equivalent frozenset.)

For tuples, cpython does not store the hash, but rather calculates it every time. So it might be that repeated hashing is slightly cheaper with frozensets. But for such a short tuple, there's probably almost no difference; it's even possible that very short tuples will be faster.

Lattyware's current timings line up reasonably well with my line of reasoning here; see below.

To test my intuition about the asymmetry of hashing new vs. old frozensets, I did the following. I believe the difference in timings is exclusively due to the extra hash time. Which is pretty insignificant, by the way:

>>> fs = frozenset((1, 2))
>>> old_fs = lambda: [frozenset((1, 2)), fs][1]
>>> new_fs = lambda: [frozenset((1, 2)), fs][0]
>>> id(fs) == id(old_fs())
True
>>> id(fs) == id(new_fs())
False
>>> %timeit hash(old_fs())
1000000 loops, best of 3: 642 ns per loop
>>> %timeit hash(new_fs())
1000000 loops, best of 3: 660 ns per loop

Note that my previous timings were wrong; using and created a timing asymmetry that the above method avoids. This new method produces expected results for tuples here -- negligable timing difference:

>>> tp = (1, 2)
>>> old_tp = lambda: [tuple((1, 2)), tp][1]
>>> new_tp = lambda: [tuple((1, 2)), tp][0]
>>> id(tp) == id(old_tp())
True
>>> id(tp) == id(new_tp())
False
>>> %timeit hash(old_tp())
1000000 loops, best of 3: 533 ns per loop
>>> %timeit hash(new_tp())
1000000 loops, best of 3: 532 ns per loop

And, the coup de grace, comparing hash time for a pre-constructred frozenset to hash time for a pre-constructed tuple:

>>> %timeit hash(fs)
10000000 loops, best of 3: 82.2 ns per loop
>>> %timeit hash(tp)
10000000 loops, best of 3: 93.6 ns per loop

Lattyware's results look more like this because they are an average of results for new and old frozensets. (They hash each tuple or frozenset twice, once in creating the dictionary, once in accessing it.)

The upshot of all this is that it probably doesn't matter, except to those of us who enjoy digging around in Python's internals and testing things into oblivion.

answered Sep 30 '22 15:09

senderle

Related questions
                            
                                What does this Python one-liner mean?
                            
                                Why does Python let me define a variable in one scope, but use it in another?
                            
                                Is there a python builtin to create tuples from multiple lists?
                            
                                XPath predicate with sub-paths with lxml?
                            
                                In Python, how do I check if a string has alphabets or numbers?
                            
                                Finding all possible case permutations in Python [duplicate]
                            
                                How to pack a UUID into a struct in Python?
                            
                                Why Python for web development? [closed]
                            
                                Python string operation, extract text between html tags
                            
                                Best (most "pythonic") way to temporarily unzip a file
                            
                                converting a string to a list of tuples
                            
                                Is there a django template equivalent of the python string multiplication function?
                            
                                python: Accessing an instance variable using a name containing a variable
                            
                                In Python, fastest way to build a list of files in a directory with a certain extension
                            
                                Django - Filtering in DetailView
                            
                                Add text to QPlainTextEdit in PyQT (the result is a status log)
                            
                                url design: ways to hide pk/id from url
                            
                                Is it alright to call len() in a loop's conditional statement?
                            
                                How to profile Google App Engine python27 runtime (not python)
                            
                                Get sums of pairs of elements in a numpy array

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With