Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Modifying a dictionary while iterating over it. Bug in Python dict?

The output of

d = {1: 1}
for k in d.keys():
    d['{}'.format(k)] = d.pop(k)
print(d)

is {'1': 1}. The output of

d = {1: 1}
for k in d.keys():
    d['i{}'.format(k)] = d.pop(k)
print(d)

is {'iiiii1': 1}. Is this a bug? I am running Python 3.6.1 :: Anaconda 4.4.0 (x86_64).

like image 599
user2725109 Avatar asked Jun 26 '17 15:06

user2725109


People also ask

How do you modify a dictionary in Python while iterating?

To modify a Python dict while iterating over it, we can use the items method to get the key and value. to loop through the key value pairs in t with t. items() and the for loop. In it, we set t2[k] to the prefix + v where v is the value in the t dict.

Can dictionary keys be changed during iteration?

It is not allowed to change the size of a dictionary while iterating over it. One way to solve the error is to use the dict. copy method to create a shallow copy of the dictionary and iterate over the copy.

Can you modify dict in Python?

Modifying a value in a dictionary is pretty similar to modifying an element in a list. You give the name of the dictionary and then the key in square brackets, and set that equal to the new value.

Can we modify list while iterating Python?

The general rule of thumb is that you don't modify a collection/array/list while iterating over it. Use a secondary list to store the items you want to act upon and execute that logic in a loop after your initial loop.


1 Answers

No, this is not a bug. This is in fact explicitly documented:

Keys and values are iterated over in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions. If keys, values and items views are iterated over with no intervening modifications to the dictionary, the order of items will directly correspond.

[...]

Iterating views while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries.

Bold emphasis mine.

You are iterating over the keys, while at the same time both adding and deleting entries in the dictionary. That worked for a few iterations, and then you hit a fail to iterate over all entries case and iteration stopped.

What happens is that you trigger a re-size at 6 additions, and that causes iteration to fail at that point; the 'next' key is now slotted in an 'earlier' slot. This happens for both tests, you just don't realise it iterates 5 times in both cases:

>>> d = {1: 1}
>>> for i, k in enumerate(d):
...     print(i)
...     d['{}'.format(k)] = d.pop(k)
...
0
1
2
3
4
>>> d = {1: 1}
>>> for i, k in enumerate(d):
...     print(i)
...     d['i{}'.format(k)] = d.pop(k)
...
0
1
2
3
4

It is run 5 times because the current dict implementation starts with a hash table of size 8, and a resize is triggered when the table is 2/3s full (your initial dict has 1 entry, 5 insertions make it > (8 * 2/3 == 5.333333). The table is getting filled with DKIX_DUMMY entities, entered when you delete a key (needed to handle hash collisions correctly).

Note that this is all highly implementation dependent. In Python 3.5 and before, both snippets iterate just once (even if you use for k in d: to avoid creating a list object for the keys); iteration continues in 3.6 because the implementation changed and iteration now follows insertion order. Future Python versions are free to alter the implementation again.

like image 192
Martijn Pieters Avatar answered Oct 26 '22 08:10

Martijn Pieters