Let's consider this code which iterates over a list while removing an item each iteration:
    x = list(range(5))
    for i in x:
        print(i)
        x.pop()
It will print 0, 1, 2. Only the first three elements are printed, since the last two elements in the list were removed by the first two iterations.
But if you try something similar on a dict:
    y = {i: i for i in range(5)}
    for i in y:
        print(i)
        y.pop(i)
It will print 0, then raise RuntimeError: dictionary changed size during iteration, because we are removing a key from the dictionary while iterating over it.
Of course, modifying a list during iteration is bad. But why is a RuntimeError not raised, as it is in the case of the dictionary? Is there any good reason for this behaviour?
To modify a Python dict while iterating over it, we can use the items method to get each key and value and write the results into a second dict. We loop through the key-value pairs in t with t.items() in a for loop. In it, we set t2[k] to prefix + v, where v is the value in the t dict.
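A minimal sketch of that pattern. Only the names t, t2 and prefix come from the paragraph above; the sample values are made up for illustration.

```python
# Hypothetical sample data; only the names t, t2 and prefix come from the text.
t = {"a": "apple", "b": "banana"}
prefix = "fresh-"

# Build a new dict instead of mutating t while looping over it.
t2 = {}
for k, v in t.items():
    t2[k] = prefix + v

print(t2)  # {'a': 'fresh-apple', 'b': 'fresh-banana'}
```

Because t itself is never changed inside the loop, no RuntimeError can occur here.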
The general rule of thumb is that you don't modify a collection/array/list while iterating over it. Use a secondary list to store the items you want to act upon and execute that logic in a loop after your initial loop.
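As a rough illustration of that rule applied to the dict from the question, assuming we want to drop the even keys (that criterion and the name to_remove are hypothetical):

```python
y = {i: i for i in range(5)}

# First pass: only record which keys should go; y is not touched here.
to_remove = [k for k in y if k % 2 == 0]

# Second pass: mutate y while iterating over the secondary list.
for k in to_remove:
    y.pop(k)

print(y)  # {1: 1, 3: 3}
```

Iterating over a copy, such as for k in list(y), is a common shorthand for the same idea.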
Modifying a value in a dictionary is pretty similar to modifying an element in a list. You give the name of the dictionary and then the key in square brackets, and set that equal to the new value.
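A small sketch of that assignment, using a made-up prices dict. It also shows that replacing the value of an existing key inside a loop is fine, since the size of the dict does not change:

```python
prices = {"tea": 2.0, "coffee": 3.0}  # hypothetical data

# Assigning to a key that already exists replaces its value.
prices["tea"] = 2.5

# Doing the same inside a loop does not change the dict's size,
# so it does not trigger the "changed size during iteration" error.
for name in prices:
    prices[name] = prices[name] * 2

print(prices)  # {'tea': 5.0, 'coffee': 6.0}
```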
The reason is that a dictionary is built for lookup, while a list is built for iteration. A dictionary uses a hash lookup, while searching a list requires walking through it from the beginning until the result is found, each time.
I think the reason is simple. lists are ordered, dicts (prior to Python 3.6/3.7) and sets are not. So modifying a list as you iterate may not be advised as best practice, but it leads to consistent, reproducible, and guaranteed behaviour.
You could use this. For example, let's say you wanted to split a list with an even number of elements in half and reverse the 2nd half:
    >>> lst = [0,1,2,3]
    >>> lst2 = [lst.pop() for _ in lst]
    >>> lst, lst2
    ([0, 1], [3, 2])
Of course, there are much better and more intuitive ways to perform this operation, but the point is it works.
By contrast, the behaviour for dicts and sets is totally implementation specific since the iteration order may change depending on the hashing.
You get a RuntimeError with collections.OrderedDict, presumably for consistency with the dict behaviour. I don't think any change in the dict behaviour is likely after Python 3.7 (where dicts are guaranteed to maintain insertion order; in CPython 3.6 this was an implementation detail) since it would break backward compatibility for no real use cases.
Note that collections.deque also raises a RuntimeError in this case, despite being ordered.
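A quick way to check these claims on your own interpreter. The helper name mutation_error is hypothetical, and the exact error messages may differ between Python versions, so the sketch only reports whether a RuntimeError was raised:

```python
from collections import OrderedDict, deque

def mutation_error(container, mutate):
    """Iterate over container, mutating it on every pass, and return the
    RuntimeError message (or None if no error was raised)."""
    try:
        for _ in container:
            mutate(container)
    except RuntimeError as exc:
        return str(exc)
    return None

# Ordered containers that still refuse mutation during iteration.
print(mutation_error(OrderedDict.fromkeys("abc"), lambda d: d.popitem()))
print(mutation_error(deque([1, 2, 3]), lambda d: d.pop()))

# A plain list silently tolerates it.
print(mutation_error([1, 2, 3], lambda l: l.pop()))  # None
```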
It wouldn't have been possible to add such a check to lists without breaking backward compatibility. For dicts, there was no such issue.
In the old, pre-iterators design, for loops worked by calling the sequence element retrieval hook with increasing integer indices until it raised IndexError. (I would say __getitem__, but this was back before type/class unification, so C types didn't have __getitem__.) len isn't even involved in this design, and there is nowhere to check for modification.
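The old protocol can be modelled roughly in modern Python. This is only a sketch of the idea described above, not the historical C implementation, and old_style_for is a hypothetical name:

```python
def old_style_for(seq, body):
    """Rough model of the pre-iterator for loop: ask for seq[0], seq[1], ...
    until IndexError, without ever consulting len(seq)."""
    index = 0
    while True:
        try:
            item = seq[index]
        except IndexError:
            break
        body(item)
        index += 1

x = list(range(5))
seen = []

def body(i):
    seen.append(i)
    x.pop()  # shrink the list on every pass, as in the question

old_style_for(x, body)
print(seen)  # [0, 1, 2]
```

The loop simply stops when the next index falls off the end of the shortened list, which reproduces the 0, 1, 2 output from the question.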
When iterators were introduced, the dict iterator had the size change check from the very first commit that introduced iterators to the language. Dicts weren't iterable at all before that, so there was no backward compatibility to break. Lists still went through the old iteration protocol, though.
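A toy Python model of such a size check, to show where the error in the question comes from. The real dict iterator is written in C and does not take a snapshot of the keys; the class name SizeCheckedKeys and the snapshot are simplifying assumptions:

```python
class SizeCheckedKeys:
    """Toy model of a dict key iterator that remembers the dict's size at
    creation and refuses to continue once that size has changed."""

    def __init__(self, d):
        self._d = d
        self._expected_size = len(d)
        self._keys = iter(list(d))  # snapshot, only to keep the model simple

    def __iter__(self):
        return self

    def __next__(self):
        # The check runs on every step, before the next key is handed out.
        if len(self._d) != self._expected_size:
            raise RuntimeError("dictionary changed size during iteration")
        return next(self._keys)

y = {i: i for i in range(5)}
seen = []
try:
    for k in SizeCheckedKeys(y):
        seen.append(k)
        y.pop(k)
except RuntimeError as exc:
    print(seen, exc)  # [0] dictionary changed size during iteration
```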
When list.__iter__ was introduced, it was purely a speed optimization, not intended to be a behavioral change, and adding a modification check would have broken backward compatibility with existing code that relied on the old behavior.