I am writing a class that defines __iter__
and __len__
, where the value of __len__
depends on the iterator returned by __iter__
. I am getting an interesting RecursionError
.
Language versions: Python 3.8.6, 3.7.6. Examples are for illustrating the error only.
In the following example, Iter.__len__()
attempts to unpack self
, store the result in a list
, and then attempts to call the built-in list.__len__()
on that list to get the length.
>>> class Iter:
... def __iter__(self):
... return range(5).__iter__()
... def __len__(self):
... return list.__len__([*self])
...
>>> len(Iter())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 5, in __len__
File "<stdin>", line 5, in __len__
File "<stdin>", line 5, in __len__
[Previous line repeated 993 more times]
File "<stdin>", line 3, in __iter__
RecursionError: maximum recursion depth exceeded in comparison
However, if I define the class Iter
as the following, where Iter.__len__()
explicitly unpacks the iterator as returned by Iter.__iter__()
:
>>> class Iter:
... def __iter__(self):
... return range(5).__iter__()
... def __len__(self):
... return list.__len__([*self.__iter__()])
...
>>> len(Iter())
5
Then there is no error.
From the traceback, it seems that list.__len__()
is trying to call Iter.__len__()
, even thought the argument provided is supposedly already a native list
object. What is the reason for the RecursionError
?
According to schwobaseggl, using set
instead of list
will not cause a RecursionError
:
>>> class Iter:
... def __iter__(self):
... return range(5).__iter__()
... def __len__(self):
... return set.__len__({*self})
...
>>> len(Iter())
5
There are 2 ways to unpack iterables in Python. For known length iterables - Providing the exact number of variables to unpack as the number of elements in the sequence/iterable. For arbitrary length iterables - Using star expressions (*) to unpack when you are unsure of the number of variables to pass.
Unpacking in Python refers to an operation that consists of assigning an iterable of values to a tuple (or list ) of variables in a single assignment statement. As a complement, the term packing can be used when we collect several values in a single variable using the iterable unpacking operator, * .
The __iter__() function returns an iterator for the given object (array, set, tuple, etc. or custom objects). It creates an object that can be accessed one element at a time using __next__() function, which generally comes in handy when dealing with loops.
It has little to do with unpacking as such, but with the implementations of different collection types, their constructors in particular.
[*iterable] # list
(*iterable,) # tuple
{*iterable} # set
all trigger calls to their classes' respective constructors.
From the current C implementation for list(iterable)
:
list___init___impl(PyListObject *self, PyObject *iterable) {
/* ... */
if (iterable != NULL) {
if (_PyObject_HasLen(iterable)) {
Py_ssize_t iter_len = PyObject_Size(iterable);
if (iter_len == -1) {
if (!PyErr_ExceptionMatches(PyExc_TypeError)) {
return -1;
}
PyErr_Clear();
}
if (iter_len > 0 && self->ob_item == NULL
&& list_preallocate_exact(self, iter_len)) {
return -1;
}
}
PyObject *rv = list_extend(self, iterable);
/* ... */
}
As can be seen (even with such limited C knowledge as mine), the iterable is tested for its size in order to allocate the right amount of memory which is what triggers the calls to __len__
of the passed iterable.
Unsurprisingly, it can be verified that set
does no such thing. After all, the relation between the size of the passed iterable and the size of the resulting set is nowhere near as direct as it is for lists or tuples. For instance, think of set([1] * 10**5)
. It would be foolish to use the size information of the passed list to allocate memory for the set.
On a side note, as has been pointed out in the comments and many other questions/answers on this site (e.g. here):
If you want to determine the length of an iterable
, there are more (mainly space-)efficient ways than to collect all items into a Sized
collection, e.g.:
def __len__(self):
return sum(1 for _ in self)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With