Given a dict iterator, get the dict

Given a list iterator, you can find the original list via the pickle protocol:

>>> L = [1, 2, 3]
>>> Li = iter(L)
>>> Li.__reduce__()[1][0] is L
True

Given a dict iterator, how can you find the original dict? I could only find a hacky way that relies on CPython implementation details (via the garbage collector):

>>> import gc
>>> def get_dict(dict_iterator):
...     [d] = gc.get_referents(dict_iterator)
...     return d
...
>>> d = {}
>>> get_dict(iter(d)) is d
True

334

asked Oct 02 '19 19:10

wim

1 Answer

There is no API to find the source iterable object from an iterator. This is intentional: iterators are seen as single-use objects; iterate and discard. As such, they often drop their iterable reference once they have reached the end; what's the point of keeping it if you can't get more elements anyway?

You see this in both the list and dict iterators: the hacks you found produce either an empty object or no object at all once you are done iterating. List iterators use an empty list when pickled:

>>> l = [1]
>>> it = iter(l)
>>> it.__reduce__()[1][0] is l
True
>>> list(it)  # exhaust the iterator
[1]
>>> it.__reduce__()[1][0] is l
False
>>> it.__reduce__()[1][0]
[]

and the dictionary iterator just sets the pointer to the original dictionary to null, so there are no referents left after that:

>>> import gc
>>> it = iter({'foo': 42})
>>> gc.get_referents(it)
[{'foo': 42}]
>>> list(it)
['foo']
>>> gc.get_referents(it)
[]
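Because the referents list can be empty, the one-element unpacking in the question's `get_dict` raises `ValueError` on an exhausted iterator. A more tolerant variant might look like the sketch below. The name `source_dict_or_none` is hypothetical, and the approach still relies on CPython's `gc.get_referents()` behaviour, so it is no less of a hack. It scans the referents instead of unpacking them because an items iterator may report a second referent (a cached result tuple).

```python
import gc


def source_dict_or_none(dict_iterator):
    """Hypothetical helper: return the dict behind a dict iterator, or None.

    Relies on CPython reporting the source dict as a GC referent of the
    iterator; this is an implementation detail, not a documented API.
    """
    for referent in gc.get_referents(dict_iterator):
        if isinstance(referent, dict):
            return referent
    # No dict among the referents: the iterator has dropped its reference.
    return None


d = {'foo': 42}
it = iter(d)
found_before = source_dict_or_none(it)   # the original dict
list(it)                                 # exhaust the iterator
found_after = source_dict_or_none(it)    # nothing left to find
```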

Both your hacks are just that: hacks. They are implementation dependent and can and probably will change between Python releases. Currently, using iter(dictionary).__reduce__() gets you the equivalent of iter, list(copy(self)) rather than access to the dictionary, because that's deemed a better implementation, but future versions might use something different altogether.
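To make the difference from the list iterator concrete, here is a small sketch of what `__reduce__()` hands back for a dict iterator on current CPython 3 releases. The variable names are illustrative only, and the result is an implementation detail that could change.

```python
d = {'foo': 42, 'bar': 81}
it = iter(d)

# __reduce__() returns (callable, args); for a dict iterator the args hold
# a freshly built list of the remaining keys, not the dict itself.
func, args = it.__reduce__()

rebuilds_with_iter = func is iter      # the builtin iter() is the callable
remaining_keys = args[0]               # a new list, built from a copy of the iterator
exposes_dict = remaining_keys is d     # the original dict is not reachable here
```

Because the list is built from a copy of the iterator state, calling `__reduce__()` does not advance `it`.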

For dictionaries, the only other option currently available is to access the di_dict pointer in the dictiter struct, with ctypes:

import ctypes

class PyObject_HEAD(ctypes.Structure):
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
    ]

class dictiterobject(ctypes.Structure):
    _fields_ = [
        ("ob_base", PyObject_HEAD),
        ("di_dict", ctypes.py_object),
        ("di_used", ctypes.c_ssize_t),
        ("di_pos", ctypes.c_ssize_t),
        ("di_result", ctypes.py_object),  # always NULL for dictkeys_iter
        ("len", ctypes.c_ssize_t),
    ]

def dict_from_dictiter(it):
    # Reinterpret the iterator's memory as the struct above (CPython only).
    di = dictiterobject.from_address(id(it))
    try:
        return di.di_dict
    except ValueError:  # null pointer: the iterator is exhausted
        return None

This is just as much of a hack as relying on gc.get_referents():

>>> d = {'foo': 42}
>>> it = iter(d)
>>> dict_from_dictiter(it)
{'foo': 42}
>>> dict_from_dictiter(it) is d
True
>>> list(it)
['foo']
>>> dict_from_dictiter(it) is None
True

For now, at least in CPython versions up to and including Python 3.8, there are no other options available.
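If you control the code that creates the iterator, you can avoid the hacks entirely by keeping the reference yourself. A minimal sketch, assuming a hypothetical `TrackedIterator` wrapper (not part of the standard library):

```python
class TrackedIterator:
    """Hypothetical wrapper: an iterator that remembers its source iterable."""

    def __init__(self, source):
        self.source = source      # explicit reference, kept even after exhaustion
        self._it = iter(source)

    def __iter__(self):
        return self

    def __next__(self):
        # Delegate to the real iterator; StopIteration propagates unchanged.
        return next(self._it)


d = {'foo': 42}
tracked = TrackedIterator(d)
keys = list(tracked)              # exhausts the wrapped iterator
still_available = tracked.source  # the original dict is still reachable
```

This does nothing for iterators you receive from elsewhere, which is the case the hacks above try to cover.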

191

answered Sep 27 '22 16:09

Martijn Pieters

Tags: python, iterator, dictionary, python-internals
