Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How is Python's iterator unpacking (star unpacking) implemented (or, what magic methods are involved in unpacking a custom iterator?)

I am writing a class that defines __iter__ and __len__, where the value of __len__ depends on the iterator returned by __iter__. I am getting an interesting RecursionError.

Language versions: Python 3.8.6, 3.7.6. Examples are for illustrating the error only.

In the following example, Iter.__len__() attempts to unpack self, store the result in a list, and then attempts to call the built-in list.__len__() on that list to get the length.

>>> class Iter:
...     def __iter__(self):
...         return range(5).__iter__()
...     def __len__(self):
...         return list.__len__([*self])
...
>>> len(Iter())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in __len__
  File "<stdin>", line 5, in __len__
  File "<stdin>", line 5, in __len__
  [Previous line repeated 993 more times]
  File "<stdin>", line 3, in __iter__
RecursionError: maximum recursion depth exceeded in comparison

However, if I define the class Iter as the following, where Iter.__len__() explicitly unpacks the iterator as returned by Iter.__iter__():

>>> class Iter:
...     def __iter__(self):
...         return range(5).__iter__()
...     def __len__(self):
...         return list.__len__([*self.__iter__()])
...
>>> len(Iter())
5

Then there is no error.

From the traceback, it seems that list.__len__() is trying to call Iter.__len__(), even thought the argument provided is supposedly already a native list object. What is the reason for the RecursionError?


According to schwobaseggl, using set instead of list will not cause a RecursionError:

>>> class Iter:
...     def __iter__(self):
...         return range(5).__iter__()
...     def __len__(self):
...         return set.__len__({*self})
...
>>> len(Iter())
5
like image 696
Tony Wu Avatar asked Dec 23 '20 17:12

Tony Wu


People also ask

How do I unpack an iterable in Python?

There are 2 ways to unpack iterables in Python. For known length iterables - Providing the exact number of variables to unpack as the number of elements in the sequence/iterable. For arbitrary length iterables - Using star expressions (*) to unpack when you are unsure of the number of variables to pass.

What is unpacking in Python explain shortly with example?

Unpacking in Python refers to an operation that consists of assigning an iterable of values to a tuple (or list ) of variables in a single assignment statement. As a complement, the term packing can be used when we collect several values in a single variable using the iterable unpacking operator, * .

What does __ ITER __ do in Python?

The __iter__() function returns an iterator for the given object (array, set, tuple, etc. or custom objects). It creates an object that can be accessed one element at a time using __next__() function, which generally comes in handy when dealing with loops.


1 Answers

It has little to do with unpacking as such, but with the implementations of different collection types, their constructors in particular.

[*iterable]  # list
(*iterable,) # tuple
{*iterable}  # set

all trigger calls to their classes' respective constructors.

From the current C implementation for list(iterable):

list___init___impl(PyListObject *self, PyObject *iterable) {
    /* ... */
    if (iterable != NULL) {
        if (_PyObject_HasLen(iterable)) {
            Py_ssize_t iter_len = PyObject_Size(iterable);
            if (iter_len == -1) {
                if (!PyErr_ExceptionMatches(PyExc_TypeError)) {
                    return -1;
                }
                PyErr_Clear();
            }
            if (iter_len > 0 && self->ob_item == NULL
                && list_preallocate_exact(self, iter_len)) {
                return -1;
            }
        }
        PyObject *rv = list_extend(self, iterable);
        /* ... */
}

As can be seen (even with such limited C knowledge as mine), the iterable is tested for its size in order to allocate the right amount of memory which is what triggers the calls to __len__ of the passed iterable.

Unsurprisingly, it can be verified that set does no such thing. After all, the relation between the size of the passed iterable and the size of the resulting set is nowhere near as direct as it is for lists or tuples. For instance, think of set([1] * 10**5). It would be foolish to use the size information of the passed list to allocate memory for the set.

On a side note, as has been pointed out in the comments and many other questions/answers on this site (e.g. here):
If you want to determine the length of an iterable, there are more (mainly space-)efficient ways than to collect all items into a Sized collection, e.g.:

def __len__(self):
    return sum(1 for _ in self)
like image 51
user2390182 Avatar answered Oct 19 '22 14:10

user2390182