Modifying a dict during iteration

Question

What's going on below

>>> d = {0:0}
>>> for i in d:
...     del d[i]
...     d[i+1] = 0
...     print(i)
...     
0
1
2
3
4
5
6
7
>>>

Why does the iteration stop at 8 without any error?

Reproducible on both python2.7 and python 3.5.

Daniel · Accepted Answer

The initial size of the key table in a dict is 8 elements. So 0...7 sets the 1st to 8th element and 8 sets the 1st element again, ending the loop.

Source: Objects/dictobject.c

/* PyDict_MINSIZE_COMBINED is the starting size for any new, non-split dict. 8 allows dicts with no more than 5 active entries; experiments suggested this suffices for the majority of dicts (consisting mostly of usually-small dicts created to pass keyword arguments). Making this 8, rather than 4 reduces the number of resizes for most dictionaries, without any significant extra memory use. */

#define PyDict_MINSIZE_COMBINED 8

gdlmx · Answer

This behavior originates from the key look up algorithm in cpython static PyDictKeyEntry * lookdict(...) as written in the document:

The basic lookup function used by all operations. This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4. ... The initial probe index is computed as hash mod the table size (which initially equals to 8).

At the beginning of every for loop, the dict_next function is called internally to resolve the address of the next element. The core of this function reads:

value_ptr = &mp->ma_keys->dk_entries[i].me_value;
mask = DK_MASK(mp->ma_keys); // size of the array which stores the key values (ma_keys)
while (i <= mask && *value_ptr == NULL) { // skip NULL elements ahead 
    value_ptr = (PyObject **)(((char *)value_ptr) + offset);
    i++;
}
if (i > mask)
    return -1; // raise StopIteration

where i is the index of the C array which actually stores the values. As written above, the initial index of a key is calculated from hash(key)%table_size. The other element in the array is all set to NULL since the dict contains only one element in your test case.

Given the fact that hash(i)==i if i is an int, the memory layout of the dict in your example will be:

1st iter: [0,   NULL,NULL,NULL,NULL,NULL,NULL,NULL]; i=0
2nd iter: [NULL,1   ,NULL,NULL,NULL,NULL,NULL,NULL]; i=1
...
8th iter: [NULL,NULL,NULL,NULL,NULL,NULL,NULL,7   ]; i=7

A more interesting test case would be:

def f(d):
  for i in d:
    del d[i]
    print hash(i)%8
    d[str(hash(i))]=0
f({0:0})       # outputs 0,1,6
f({'hello':0}) # outputs 5,7
f({'world':0}) # outputs 1

To conclude, the exiting condition of such loop is

hash(new_key)%8<=hash(old_key)%8

Modifying a dict during iteration

Tags:

python

dictionary

wim

2 Answers

Source: Objects/dictobject.c

Daniel

gdlmx

Recent Activity

Donate For Us

Modifying a dict during iteration

Tags:

python

dictionary

wim

2 Answers

Source: Objects/dictobject.c

Daniel

gdlmx

Related questions

Recent Activity

Donate For Us