What's going on below
>>> d = {0:0}
>>> for i in d:
... del d[i]
... d[i+1] = 0
... print(i)
...
0
1
2
3
4
5
6
7
>>>
Why does the iteration stop at 8 without any error?
Reproducible on both python2.7 and python 3.5.
The initial size of the key table in a dict is 8 elements. So 0
...7
sets the 1st to 8th element and 8 sets the 1st element again, ending the loop.
/* PyDict_MINSIZE_COMBINED is the starting size for any new, non-split dict. 8 allows dicts with no more than 5 active entries; experiments suggested this suffices for the majority of dicts (consisting mostly of usually-small dicts created to pass keyword arguments). Making this 8, rather than 4 reduces the number of resizes for most dictionaries, without any significant extra memory use. */
#define PyDict_MINSIZE_COMBINED 8
This behavior originates from the key look up algorithm in cpython static PyDictKeyEntry * lookdict(...)
as written in the document:
The basic lookup function used by all operations. This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4. ... The initial probe index is computed as hash mod the table size (which initially equals to 8).
At the beginning of every for loop, the dict_next
function is called internally to resolve the address of the next element. The core of this function reads:
value_ptr = &mp->ma_keys->dk_entries[i].me_value;
mask = DK_MASK(mp->ma_keys); // size of the array which stores the key values (ma_keys)
while (i <= mask && *value_ptr == NULL) { // skip NULL elements ahead
value_ptr = (PyObject **)(((char *)value_ptr) + offset);
i++;
}
if (i > mask)
return -1; // raise StopIteration
where i
is the index of the C array which actually stores the values. As written above, the initial index of a key is calculated from hash(key)%table_size
. The other element in the array is all set to NULL
since the dict contains only one element in your test case.
Given the fact that hash(i)==i
if i
is an int, the memory layout of the dict in your example will be:
1st iter: [0, NULL,NULL,NULL,NULL,NULL,NULL,NULL]; i=0
2nd iter: [NULL,1 ,NULL,NULL,NULL,NULL,NULL,NULL]; i=1
...
8th iter: [NULL,NULL,NULL,NULL,NULL,NULL,NULL,7 ]; i=7
A more interesting test case would be:
def f(d):
for i in d:
del d[i]
print hash(i)%8
d[str(hash(i))]=0
f({0:0}) # outputs 0,1,6
f({'hello':0}) # outputs 5,7
f({'world':0}) # outputs 1
To conclude, the exiting condition of such loop is
hash(new_key)%8<=hash(old_key)%8
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With