
Detecting that an object is repeatedly iterable

Does iter(obj) is obj imply that obj isn't repeatedly iterable, and vice versa? I didn't see any such wording in the docs, but according to this comment, the standard library decides whether an object is repeatedly iterable by testing whether iter(obj) is obj:

@agf: There are parts of the Python standard library that rely on this part of the spec; they detect whether something is an iterator/generator by testing if iter(obj) is obj:, because a true iterator/generator object will have __iter__ defined as the identity function. If the test is true, they convert to list to allow repeated iteration, otherwise, it's assumed that the object is repeatably iterable, and they can use it as is.
– ShadowRanger Jun 3 at 17:23
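
The pattern the comment describes can be sketched roughly as follows. This is only an illustration of the idea, not code taken from the standard library; the helper name as_repeatable is made up for this example.

```python
def as_repeatable(obj):
    """Return obj unchanged, unless it is its own iterator; then copy it into a list.

    Hypothetical helper illustrating the `iter(obj) is obj` test.
    """
    if iter(obj) is obj:
        # obj is an iterator or generator: it supports one pass only,
        # so materialize it to allow repeated iteration.
        return list(obj)
    # Otherwise obj is assumed (not proven) to be repeatedly iterable.
    return obj


data = as_repeatable(x * x for x in range(4))
first_pass = list(data)
second_pass = list(data)
print(first_pass == second_pass)  # True: the generator was copied into a list
```

A list passed to this helper comes back as the very same object, since iter() on a list returns a separate list iterator.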

The docs do state that if obj is an iterator, it's required that iter(obj) returns obj. But I don't think that's enough to conclude that non-repeatedly iterable objects can be identified using iter(obj) is obj.

asked Nov 06 '16 02:11 by max



1 Answer

All iterators are iterables, but not all iterables are iterators.

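The essential requirement of an iterable is that it defines an __iter__() method which returns an iterator (iter() also falls back to __getitem__() for old-style sequences, which doesn't change the argument here):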

One method needs to be defined for container objects to provide iteration support:

container.__iter__()
Return an iterator object.
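
As a small illustration of that distinction, here is what it looks like with a built-in list, which is an iterable but not an iterator (the variable names are just for this example):

```python
numbers = [1, 2, 3]          # a container: iterable, but not an iterator

it1 = iter(numbers)          # calls numbers.__iter__()
it2 = iter(numbers)          # a second, independent iterator

print(it1 is numbers)        # False: the list is not its own iterator
print(it1 is it2)            # False: each call produced a new iterator
print(next(it1), next(it1))  # 1 2
print(next(it2))             # 1 (it2 keeps its own position)
```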

An iterator must follow the iterator protocol, which has two requirements:

  1. It has an __iter__() method that returns the object itself:

    iterator.__iter__()
    Return the iterator object itself.

  2. It has a __next__() method which returns the next item on each call, and, once exhausted, raises StopIteration on every subsequent call:

    Once an iterator’s __next__() method raises StopIteration, it must continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken.
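
A minimal sketch of a class that satisfies both requirements might look like this. Countdown is a hypothetical class written for this answer, not something from the docs:

```python
class Countdown:
    """Hypothetical iterator that produces n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # Requirement 1: an iterator returns itself.
        return self

    def __next__(self):
        # Requirement 2: once exhausted, keep raising StopIteration.
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


c = Countdown(3)
print(iter(c) is c)  # True
print(list(c))       # [3, 2, 1]
print(list(c))       # []: already exhausted
```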

These requirements mean that an iterator can never be iterated more than once, and that you can always confirm that an iterable is an iterator (and therefore unrepeatable by definition) by checking that iter(obj) is obj:

def is_unrepeatable(obj):
    return iter(obj) is obj
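
For illustration, this is how the check behaves on a few built-in types in Python 3 (the function is repeated so the snippet runs on its own):

```python
def is_unrepeatable(obj):
    return iter(obj) is obj


results = {
    "list": is_unrepeatable([1, 2, 3]),
    "list iterator": is_unrepeatable(iter([1, 2, 3])),
    "generator": is_unrepeatable(x for x in range(3)),
    "range": is_unrepeatable(range(3)),
    "map": is_unrepeatable(map(str, [1, 2])),
}

for name, value in results.items():
    print(name, value)
# list False
# list iterator True
# generator True
# range False
# map True
```

Note that iter() raises TypeError for an object that isn't iterable at all, so the function only makes sense for arguments already known to be iterable.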

However: since the only requirement of an iterable is that iter(obj) returns some iterator, you can't prove that it is repeatable. An iterable could define an __iter__() method which returns an iterator with different behaviour each time it's called: for instance, it could return an iterator which iterates over its elements the first time it's called, but on subsequent calls, return an iterator which immediately raises StopIteration.

This behaviour would be strange (and annoying), but is not prohibited. Here's an example of a non-repeatable iterable class which is not an iterator:

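class Unrepeatable:
    """Iterable (not an iterator) that yields its items on the first pass only."""

    def __init__(self, iterable):
        self.iterable = iterable
        self.exhausted = False

    def __iter__(self):
        # Generator function: every call returns a new generator object,
        # so iter(x) is never x itself.
        if self.exhausted:
            return  # later passes finish immediately
        self.exhausted = True
        yield from self.iterable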

>>> x = Unrepeatable([1,2,3])
>>> list(x)
[1, 2, 3]
>>> list(x)
[]
>>> iter(x) is x
False

I wouldn't hesitate to call such a "fake iterator" badly-behaved, and I can't think of a situation where you'd find one in the wild, but as demonstrated above, it is possible.

answered Nov 14 '22 23:11 by Zero Piraeus
