Is iterating over some_dict.items() as efficient as iterating over a list of the same items in CPython?
Analysis of the test run result: a dictionary is 6.6 times faster than a list when we look up a value among 100 items.
A list is an ordered collection of data, whereas a dictionary stores its data as key-value pairs in a hash table. Because of this, finding an element in a list means scanning it item by item, while a dictionary can locate a key by hashing it. Therefore, lookups are faster in a dictionary than in a list in Python.
Python is slow. I bet you have encountered this argument against using Python many times, especially from people who come from the C, C++, or Java world. It is true in many cases; for instance, looping over or sorting Python arrays, lists, or dictionaries can sometimes be slow.
It depends on which version of Python you're using. In Python 2, some_dict.items() creates a new list, which takes some additional time and uses additional memory. On the other hand, once the list is created, it's a list, and so should have identical performance characteristics after the overhead of list creation is complete.
In Python 3, some_dict.items() creates a view object instead of a list, and I anticipate that creating and iterating over items() would be faster than in Python 2, since nothing has to be copied. But I also anticipate that iterating over an already-created view would be a bit slower than iterating over an already-created list, because dictionary data is stored somewhat sparsely, and I believe there's no good way for Python to avoid iterating over every bin in the dictionary, even the empty ones.
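To make the view-versus-list distinction concrete, here is a minimal Python 3 sketch. The variable names are only illustrative. It shows that items() hands back a live view of the dictionary, while list(...) makes a one-time copy of the pairs:

```python
# Python 3: dict.items() returns a live view, not a copy.
some_dict = {"a": 1, "b": 2}

items_view = some_dict.items()             # nothing is copied here
items_snapshot = list(some_dict.items())   # copies every (key, value) pair

some_dict["c"] = 3

# The view reflects the new key; the list snapshot does not.
view_len = len(items_view)
snapshot_len = len(items_snapshot)
print(type(items_view).__name__, view_len, snapshot_len)  # dict_items 3 2
```

This is also why a cached list can go stale: it only holds what was in the dictionary at the moment it was built.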
In Python 2, some timings confirm my intuitions:
>>> some_dict = dict(zip(xrange(1000), reversed(xrange(1000))))
>>> some_list = zip(xrange(1000), xrange(1000))
>>> %timeit for t in some_list: t
10000 loops, best of 3: 25.6 us per loop
>>> %timeit for t in some_dict.items(): t
10000 loops, best of 3: 57.3 us per loop
Iterating over items() is roughly twice as slow as iterating over the list. Using iteritems() is a tad faster than items()...
>>> %timeit for t in some_dict.iteritems(): t
10000 loops, best of 3: 41.3 us per loop
But iterating over the list that items() returns is basically the same as iterating over any other list:
>>> some_dict_list = some_dict.items()
>>> %timeit for t in some_dict_list: t
10000 loops, best of 3: 26.1 us per loop
Python 3 can create and iterate over items() faster than Python 2 can (compare to 57.3 us above):
>>> some_dict = dict(zip(range(1000), reversed(range(1000))))
>>> %timeit for t in some_dict.items(): t
10000 loops, best of 3: 33.4 us per loop
But the time to create a view is negligible; the view itself is actually slower to iterate over than a list.
>>> some_list = list(zip(range(1000), reversed(range(1000))))
>>> some_dict_view = some_dict.items()
>>> %timeit for t in some_list: t
10000 loops, best of 3: 18.6 us per loop
>>> %timeit for t in some_dict_view: t
10000 loops, best of 3: 33.3 us per loop
This means that in Python 3, if you want to iterate many times over the items in a dictionary, and performance is critical, you can get a noticeable speedup by caching the view as a list (33.3 us down to 18.6 us per loop in these timings).
>>> some_list = list(some_dict_view)
>>> %timeit for t in some_list: t
100000 loops, best of 3: 18.6 us per loop
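If you don't have IPython at hand, the same comparison can be sketched with the standard timeit module. This is only a rough sketch: the helper names iterate and best_time are mine, and the absolute numbers will vary with the machine and the Python version.

```python
import timeit

some_dict = dict(zip(range(1000), reversed(range(1000))))
some_dict_view = some_dict.items()     # live view, cheap to create
cached_items = list(some_dict_view)    # one-time copy of all pairs

def iterate(iterable):
    # Walk the iterable once without doing any work per item.
    for t in iterable:
        pass

def best_time(iterable, number=1000, repeat=3):
    # Best-of-`repeat` average time per pass, in seconds.
    runs = timeit.repeat(lambda: iterate(iterable), number=number, repeat=repeat)
    return min(runs) / number

view_time = best_time(some_dict_view)
list_time = best_time(cached_items)
print("view: %.1f us per loop" % (view_time * 1e6))
print("list: %.1f us per loop" % (list_time * 1e6))
```

Keep in mind that the cached list is a snapshot. If the dictionary changes afterwards, the list has to be rebuilt, so this only pays off when the dictionary stays fixed across many passes.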
A little benchmark shows me that iterating a list is definitely faster.
def iterlist(list_):
    i = 0
    for _ in list_:
        i += 1
    return i

def iterdict(dict_):
    i = 0
    for _ in dict_.iteritems():
        i += 1
    return i

def noiterdict(dict_):
    i = 0
    for _ in dict_.items():
        i += 1
    return i

list_ = range(1000000)
dict_ = dict(zip(range(1000000), range(1000000)))
Tested with IPython on Python 2.7 (Kubuntu):
%timeit iterlist(list_)
10 loops, best of 3: 28.5 ms per loop
%timeit iterdict(dict_)
10 loops, best of 3: 39.7 ms per loop
%timeit noiterdict(dict_)
10 loops, best of 3: 86.1 ms per loop
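For anyone wanting to repeat this on Python 3, here is a rough port, not the benchmark that produced the numbers above. It assumes Python 3, where iteritems() no longer exists and items() returns a view, so only two cases remain. I also shrank the size to 100000 so it finishes quickly, and used the timeit module instead of %timeit.

```python
import timeit

def iterlist(list_):
    # Count the elements by walking the list.
    i = 0
    for _ in list_:
        i += 1
    return i

def iterdict(dict_):
    # Count the (key, value) pairs by walking the items() view.
    i = 0
    for _ in dict_.items():
        i += 1
    return i

n = 100000  # smaller than the original 1000000, chosen for a quick run
list_ = list(range(n))
dict_ = dict(zip(range(n), range(n)))

# Best of 3 runs, 10 calls per run, reported per call.
list_time = min(timeit.repeat(lambda: iterlist(list_), number=10, repeat=3)) / 10
dict_time = min(timeit.repeat(lambda: iterdict(dict_), number=10, repeat=3)) / 10
print("iterlist: %.2f ms per loop" % (list_time * 1e3))
print("iterdict: %.2f ms per loop" % (dict_time * 1e3))
```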