Modify list and dictionary during iteration, why does it fail on dict?

Tags: python, dictionary, loops, iteration, list

Let's consider this code which iterates over a list while removing an item each iteration:

    x = list(range(5))

    for i in x:
        print(i)
        x.pop()

It will print 0, 1, 2. Only the first three elements are printed since the last two elements in the list were removed by the first two iterations.

But if you try something similar on a dict:

    y = {i: i for i in range(5)}

    for i in y:
        print(i)
        y.pop(i)

It will print 0, then raise RuntimeError: dictionary changed size during iteration, because we are removing a key from the dictionary while iterating over it.

Of course, modifying a list during iteration is bad. But why is a RuntimeError not raised, as it is for a dictionary? Is there any good reason for this behaviour?

asked Apr 04 '18 08:04

ducminh

2 Answers

I think the reason is simple: lists are ordered, whereas dicts (prior to Python 3.6/3.7) and sets are not. Modifying a list as you iterate over it may not be good practice, but it leads to consistent, reproducible, and guaranteed behaviour.

You could even make use of this. For example, say you wanted to split a list with an even number of elements in half and reverse the second half:

    >>> lst = [0, 1, 2, 3]
    >>> lst2 = [lst.pop() for _ in lst]
    >>> lst, lst2
    ([0, 1], [3, 2])

Of course, there are much better and more intuitive ways to perform this operation, but the point is it works.
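As a rough illustration of one of those "more intuitive ways", here is a minimal sketch that compares the pop-while-iterating trick with plain slicing. The function names are hypothetical and chosen only for this example; both versions assume an even number of elements, as in the answer.

```python
def split_and_reverse_by_popping(items):
    """The trick from the answer: pop from the end while iterating."""
    lst = list(items)                  # work on a copy
    lst2 = [lst.pop() for _ in lst]    # loop stops once the index reaches the shrinking length
    return lst, lst2


def split_and_reverse_by_slicing(items):
    """A more conventional version using slices; nothing is mutated during iteration."""
    lst = list(items)
    half = len(lst) // 2
    return lst[:half], lst[half:][::-1]


popped = split_and_reverse_by_popping([0, 1, 2, 3])
sliced = split_and_reverse_by_slicing([0, 1, 2, 3])
print(popped)  # ([0, 1], [3, 2])
print(sliced)  # ([0, 1], [3, 2])
```

The slicing version states the intent directly and does not depend on how the list iterator reacts to a shrinking list.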

By contrast, the behaviour for dicts and sets is totally implementation-specific, since the iteration order may change depending on the hashing.

You get a RuntimeError with collections.OrderedDict too, presumably for consistency with the dict behaviour. I don't think any change in the dict behaviour is likely now that dicts are guaranteed to maintain insertion order (an implementation detail of CPython 3.6, guaranteed by the language from Python 3.7), since it would break backward compatibility for no real use case.

Note that collections.deque also raises a RuntimeError in this case, despite being ordered.
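A small sketch to check these claims side by side. The helper name `mutate_while_iterating` is made up for this example; it removes one item per iteration and reports whether a RuntimeError was raised.

```python
from collections import OrderedDict, deque


def mutate_while_iterating(container, remove):
    """Remove one item per iteration; return the RuntimeError raised, or None if the loop finishes."""
    try:
        for _ in container:
            remove(container)
    except RuntimeError as exc:
        return exc
    return None


od_error = mutate_while_iterating(OrderedDict.fromkeys(range(5)), lambda c: c.popitem())
dq_error = mutate_while_iterating(deque(range(5)), lambda c: c.pop())
list_error = mutate_while_iterating(list(range(5)), lambda c: c.pop())

print(type(od_error).__name__)  # RuntimeError
print(type(dq_error).__name__)  # RuntimeError
print(list_error)               # None: the list loop simply ends early
```

Under CPython, the two ordered containers from collections refuse the mutation, while the plain list loop just terminates sooner.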

answered Oct 09 '22 09:10

Chris_Rands

It wouldn't have been possible to add such a check to lists without breaking backward compatibility. For dicts, there was no such issue.

In the old, pre-iterators design, for loops worked by calling the sequence element retrieval hook with increasing integer indices until it raised IndexError. (I would say __getitem__, but this was back before type/class unification, so C types didn't have __getitem__.) len isn't even involved in this design, and there is nowhere to check for modification.
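The index-based protocol described above can be approximated in today's Python. This is only a sketch: `Countdown` and `legacy_for` are hypothetical names, and `seq[index]` stands in for the C-level element retrieval hook. Modern Python still falls back to this protocol for objects that define `__getitem__` but not `__iter__`.

```python
class Countdown:
    """Hypothetical sequence with only __getitem__ (no __iter__, no __len__)."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if index >= self.n:
            raise IndexError(index)    # signals the end of iteration
        return self.n - index


# Iteration works through the index-based fallback protocol.
legacy = [value for value in Countdown(3)]
print(legacy)  # [3, 2, 1]


def legacy_for(seq, body):
    """Emulate the old for loop: fetch seq[0], seq[1], ... until IndexError."""
    index = 0
    while True:
        try:
            item = seq[index]
        except IndexError:
            break                      # the length is never consulted
        body(item)
        index += 1


seen = []
x = list(range(5))


def body(item):
    seen.append(item)
    x.pop()                            # shrink the list on every iteration


legacy_for(x, body)
print(seen)  # [0, 1, 2]
print(x)     # [0, 1]
```

The loop ends as soon as the next index is out of range, which reproduces the `0, 1, 2` output from the question without any place where a modification check could live.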

When iterators were introduced, the dict iterator had the size change check from the very first commit that introduced iterators to the language. Dicts weren't iterable at all before that, so there was no backward compatibility to break. Lists still went through the old iteration protocol, though.
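To see the size-change check in action, the sketch below drives a dict iterator by hand. It also shows iterating over a snapshot of the keys, a common workaround that is not part of the original answer.

```python
y = {i: i for i in range(5)}

it = iter(y)
first = next(it)        # 0; nothing has been modified yet
y.pop(first)            # the dict shrinks from 5 to 4 entries

try:
    next(it)            # the iterator notices the size change on its next step
except RuntimeError as exc:
    message = str(exc)

print(message)  # dictionary changed size during iteration

# Iterating over a snapshot of the keys leaves the dict free to change.
for key in list(y):
    y.pop(key)

print(y)  # {}
```

The error is raised by the iterator on its next step, not by `pop` itself, which is why the question's loop prints `0` before failing.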

When list.__iter__ was introduced, it was purely a speed optimization, not intended to be a behavioral change, and adding a modification check would have broken backward compatibility with existing code that relied on the old behavior.

answered Oct 09 '22 09:10

user2357112 supports Monica
