
Iterable using yield or __next__()

I am looking at making iterable objects and am wondering which of these two options is the more Pythonic or otherwise better way. Is there no real difference, or have I got the wrong idea about using yield? To me, using yield seems cleaner, and apparently it is faster than using __next__(), but I'm not sure.

class iterable_class():

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i < self.n:
            i = self.i
            self.i += 1
            return i
        else:
            raise StopIteration()

Using yield:

class iterable_class_with_generator():

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        while self.i < self.n:
            yield self.i
            self.i += 1

Burton2000 asked Apr 16 '18 17:04


2 Answers

One observable difference is that the first version implements an iterator (an object that has __next__ and whose __iter__ returns itself), while the second one implements an iterable (an object that implements __iter__ to create some iterator). In most cases this doesn't make a difference because the for statement and all of itertools accept any iterable.

The difference is visible with the following code:

>>> x = iterable_class(10)
>>> next(x)
0
>>> next(x)
1
>>> list(x)
[2, 3, 4, 5, 6, 7, 8, 9]

Obviously this won't work with iterable_class_with_generator because it doesn't implement __next__. But there is a deeper difference: since list(x) accepts any iterable, it first calls x.__iter__(), which in the case of iterable_class_with_generator creates a new generator each time. If the iteration state lives in the generator rather than in self.i (as in the simplified version below), that new generator starts the count from the beginning. A true generator-based iterator is presented at the end of the answer, but in most cases the difference won't matter.
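To make the iterator/iterable distinction concrete, here is a minimal sketch. The class names CountIterator and CountIterable are hypothetical stand-ins of mine that mirror the two versions from the question; they are not part of the original code.

```python
class CountIterator:
    """Iterator: __iter__ returns self, __next__ produces the values."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        value = self.i
        self.i += 1
        return value


class CountIterable:
    """Iterable only: __iter__ is a generator function, there is no __next__."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        while self.i < self.n:
            yield self.i
            self.i += 1


iterator = CountIterator(3)
same_object = iter(iterator) is iterator  # an iterator is its own iterator

iterable = CountIterable(3)
try:
    next(iterable)  # the instance has no __next__, so this raises TypeError
    next_failed = False
except TypeError:
    next_failed = True

generator = iter(iterable)   # ask the iterable for an iterator first
first_value = next(generator)  # now next() works, on the generator
```

So with the generator version you call next() on the result of iter(x), not on x itself.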

Regarding the style difference of whether to use a generator or define your own __next__, both will be recognized as correct Python, so you should choose the one that reads better to the person or team who will maintain the code. Since the generator version is shorter and generators are a well-understood Python idiom, I'd choose that one.

Note that if you implement __iter__ with a generator, you don't need to keep iteration state in instance vars because the generator does it for you. The code is then even simpler:

class iterable_class_with_generator:
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        for i in range(self.n):
            yield i
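One consequence of keeping the state in the generator is that every iteration is independent. A small sketch, assuming a hypothetical class name StatelessCounter that has the same body as the class above:

```python
class StatelessCounter:
    """Same shape as the simplified class above; the name is illustrative only."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # Each call creates a fresh generator with its own loop variable.
        for i in range(self.n):
            yield i


counter = StatelessCounter(3)

first_pass = list(counter)   # iterates from the start
second_pass = list(counter)  # a new generator, so it starts over

# Nested loops over the same object do not interfere with each other,
# because the inner loop gets its own generator on every outer step.
pairs = [(a, b) for a in counter for b in counter]
```

Both passes produce the full sequence, and the nested loop yields all nine pairs. A version that stores the position in self.i cannot do this, since all of its generators share one counter.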


Finally, here is a version of iterable_class_with_generator that implements a true iterator that uses a generator internally:
class iterable_class_with_generator:
    def __init__(self, n):
        self._gen = self._generate(n)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._gen)

    def _generate(self, n):
        for i in range(n):
            yield i
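
As a rough check that this version behaves like the first iterable_class session, here is a self-contained sketch. The class is repeated under the hypothetical name GeneratorBackedIterator so the snippet runs on its own.

```python
class GeneratorBackedIterator:
    """Iterator whose state lives in an internal generator (illustrative name)."""

    def __init__(self, n):
        self._gen = self._generate(n)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the exhausted generator propagates to the caller.
        return next(self._gen)

    def _generate(self, n):
        for i in range(n):
            yield i


x = GeneratorBackedIterator(10)
first = next(x)   # next() works directly on the object
second = next(x)
rest = list(x)    # continues where next() left off
again = list(x)   # the single underlying generator is now exhausted
```

Here list(x) picks up after the two next() calls instead of restarting, and a second list(x) comes back empty, because __iter__ returns the same object every time.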

user4815162342 answered Oct 31 '22 00:10


The devil is in the details.

chepner already mentioned the significant difference in the comments.

iterable_class.__iter__ returns the same iterator (namely, itself) each time it is called, while iterable_class_with_generator.__iter__ returns a new, independent iterator each time.

This can give you surprising results if you are not aware of exactly what's happening.

>>> x = iterable_class_with_generator(5)
>>> it = iter(x)
>>> list(it)
[0, 1, 2, 3, 4]
>>> x.i = 0
>>> list(it)
[]
>>> 
>>> x = iterable_class(5)
>>> it = iter(x)
>>> list(it)
[0, 1, 2, 3, 4]
>>> x.i = 0
>>> list(it)
[0, 1, 2, 3, 4]

As you can see, the generator created from calling iter with an instance of iterable_class_with_generator stays exhausted once it raises StopIteration.

The iterator from an instance of iterable_class is that instance itself, so fiddling with x.i can change the state of the iterator.

Conclusion:

If you want an iterator, implement __iter__ (which does nothing but return self) and __next__. If you want an iterable that is not an iterator itself, implement __iter__ and return an iterator in the body of this method.

The two approaches are different, and if you want an iterable that is not an iterator, this subtle difference can bite you.
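
If you are unsure which of the two you have written, the abstract base classes in collections.abc can tell you. This is only a sketch: AsIterator and AsIterable are hypothetical names of mine, and AsIterable returns iter(range(...)) just to show that __iter__ may return any iterator, not only a generator.

```python
from collections.abc import Iterable, Iterator


class AsIterator:
    """Implements __iter__ (returning self) and __next__."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        value = self.i
        self.i += 1
        return value


class AsIterable:
    """Implements only __iter__, which hands out a new iterator each time."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return iter(range(self.n))


# Both ABCs recognize classes structurally, by the methods they define.
iterator_is_iterator = isinstance(AsIterator(3), Iterator)
iterator_is_iterable = isinstance(AsIterator(3), Iterable)
iterable_is_iterable = isinstance(AsIterable(3), Iterable)
iterable_is_iterator = isinstance(AsIterable(3), Iterator)

one_shot = AsIterator(3)
one_shot_runs = (list(one_shot), list(one_shot))  # second pass is empty

reusable = AsIterable(3)
reusable_runs = (list(reusable), list(reusable))  # both passes are full
```

Every iterator is also an iterable, but not the other way around: only the last isinstance check is False.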

timgeb answered Oct 31 '22 00:10