
Given a dict iterator, get the dict

Given a list iterator, you can find the original list via the pickle protocol:

>>> L = [1, 2, 3]
>>> Li = iter(L)
>>> Li.__reduce__()[1][0] is L
True

Given a dict iterator, how can you find the original dict? I could only find a hacky way using CPython implementation details (via garbage collector):

>>> import gc
>>> def get_dict(dict_iterator):
...     [d] = gc.get_referents(dict_iterator)
...     return d
...
>>> d = {}
>>> get_dict(iter(d)) is d
True
asked Oct 02 '19 at 19:10 by wim



1 Answer

There is no API to find the source iterable object from an iterator. This is intentional: iterators are seen as single-use objects that you iterate over and then discard. As such, they often drop their reference to the iterable once they have reached the end; what's the point of keeping it if you can't get more elements, anyway?

You see this in both the list and dict iterators: the hacks you found produce either empty objects or None once you are done iterating. List iterators use an empty list when pickled:

>>> l = [1]
>>> it = iter(l)
>>> it.__reduce__()[1][0] is l
True
>>> list(it)  # exhaust the iterator
[1]
>>> it.__reduce__()[1][0] is l
False
>>> it.__reduce__()[1][0]
[]

and the dictionary iterator just sets the pointer to the original dictionary to null, so there are no referents left after that:

>>> import gc
>>> it = iter({'foo': 42})
>>> gc.get_referents(it)
[{'foo': 42}]
>>> list(it)
['foo']
>>> gc.get_referents(it)
[]
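
As a rough sketch of how the gc-based hack could tolerate an exhausted iterator, the helper below (the name get_dict_or_none is hypothetical) filters the referents for a dict and returns None when none is found. It assumes CPython, where gc.get_referents() reports the dict an unexhausted dict iterator points to; items iterators may report an extra referent besides the dict, which is why the result is filtered rather than unpacked:

```python
import gc


def get_dict_or_none(dict_iterator):
    """Return the dict behind a dict iterator, or None if it is gone.

    Relies on a CPython implementation detail: gc.get_referents()
    exposes the objects the iterator holds references to.
    """
    for referent in gc.get_referents(dict_iterator):
        # Skip anything that is not the dict itself (for example a
        # cached result tuple held by an items iterator).
        if isinstance(referent, dict):
            return referent
    return None


d = {'foo': 42}
it = iter(d)
print(get_dict_or_none(it) is d)   # the iterator still references d
list(it)                           # exhaust the iterator
print(get_dict_or_none(it))        # the reference has been dropped
```

This is still a hack; it only makes the "no referents left" case explicit instead of failing on unpacking.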

Both of your hacks are just that: hacks. They are implementation dependent and can, and probably will, change between Python releases. Currently, iter(dictionary).__reduce__() gets you the equivalent of (iter, (list(copy(self)),)) rather than access to the dictionary, because that was deemed a better implementation; future versions might use something different altogether.
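
To illustrate why the pickle route does not work for dictionaries, here is a small sketch, assuming a recent CPython 3 release. The reduce value holds the builtin iter and a fresh list of the keys not yet consumed, not the dictionary itself:

```python
d = {'foo': 42, 'bar': 81}
it = iter(d)
next(it)  # consume 'foo'

# __reduce__() returns (callable, args); for a dict key iterator on
# CPython the callable is the builtin iter and args holds a list
# built from a copy of the iterator, so `it` itself is not advanced.
func, args = it.__reduce__()
remaining = args[0]

print(func is iter)      # True on CPython
print(remaining)         # the keys not yet consumed
print(remaining is d)    # False: a new list, not the original dict
```

Unpickling such an iterator therefore yields a list iterator over a snapshot of the remaining keys, with no link back to the source dictionary.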

For dictionaries, the only other option currently available is to access the di_dict pointer in the dictiter struct, with ctypes:

import ctypes

class PyObject_HEAD(ctypes.Structure):
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
    ]

class dictiterobject(ctypes.Structure):
    _fields_ = [
        ("ob_base", PyObject_HEAD),
        ("di_dict", ctypes.py_object),
        ("di_used", ctypes.c_ssize_t),
        ("di_pos", ctypes.c_ssize_t),
        ("di_result", ctypes.py_object),  # always NULL for dictkeys_iter
        ("len", ctypes.c_ssize_t),
    ]

def dict_from_dictiter(it):
    di = dictiterobject.from_address(id(it))
    try:
        return di.di_dict
    except ValueError:  # null pointer
        return None

This is just as much of a hack as relying on gc.get_referents():

>>> d = {'foo': 42}
>>> it = iter(d)
>>> dict_from_dictiter(it)
{'foo': 42}
>>> dict_from_dictiter(it) is d
True
>>> list(it)
['foo']
>>> dict_from_dictiter(it) is None
True

For now, at least in CPython versions up to and including Python 3.8, there are no other options available.
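
If you control the code that creates the iterator, one portable alternative that avoids implementation details is to carry the reference yourself. The wrapper below is only a sketch; the class name TrackedIterator and its source attribute are hypothetical, not part of any standard API:

```python
class TrackedIterator:
    """Iterator wrapper that remembers the iterable it was created from."""

    def __init__(self, source):
        self.source = source      # keep our own reference to the iterable
        self._it = iter(source)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)


d = {'foo': 42}
it = TrackedIterator(d)
print(it.source is d)   # True
print(list(it))         # ['foo']
print(it.source is d)   # still True after exhaustion
```

Unlike the built-in iterators, this wrapper keeps the source alive for as long as the wrapper exists, which is exactly the behavior the built-ins avoid on purpose.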

answered Sep 27 '22 at 16:09 by Martijn Pieters