I am looking at making iterable objects and am wondering which of these two options is the more Pythonic or otherwise better way. Is there no real difference, or have I got the wrong idea about using yield? To me, using yield seems cleaner, and apparently it is faster than using __next__(), but I'm not sure.
class iterable_class():
    def __init__(self, n):
        self.i = 0
        self.n = n
    def __iter__(self):
        return self
    def __next__(self):
        if self.i < self.n:
            i = self.i
            self.i += 1
            return i
        else:
            raise StopIteration()
Using yield:
class iterable_class_with_generator():
    def __init__(self, n):
        self.i = 0
        self.n = n
    def __iter__(self):
        while self.i < self.n:
            yield self.i
            self.i += 1
One observable difference is that the first version implements an iterator (an object that has __next__ and whose __iter__ returns itself), while the second one implements an iterable (an object that implements __iter__ to create some iterator). In most cases this doesn't make a difference, because the for statement and the itertools functions accept any iterable.
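One way to see which of the two you have is to check against the abstract base classes in collections.abc. The sketch below uses two stand-in classes, CountIterator and CountIterable (hypothetical names that mirror the two versions in the question), so treat it as an illustration rather than the only way to test this.

```python
from collections.abc import Iterable, Iterator


class CountIterator:
    """Iterator: has __next__, and __iter__ returns self."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        value = self.i
        self.i += 1
        return value


class CountIterable:
    """Iterable only: __iter__ is a generator function, there is no __next__."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        yield from range(self.n)


# Each tuple is (is it an Iterable?, is it an Iterator?).
iterator_checks = (
    isinstance(CountIterator(3), Iterable),
    isinstance(CountIterator(3), Iterator),
)
iterable_checks = (
    isinstance(CountIterable(3), Iterable),
    isinstance(CountIterable(3), Iterator),
)
print(iterator_checks)  # (True, True)
print(iterable_checks)  # (True, False)
```

Every iterator is also an iterable, but not the other way round, which is why for and itertools are happy with either.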
The difference is visible with the following code:
>>> x = iterable_class(10)
>>> next(x)
0
>>> next(x)
1
>>> list(x)
[2, 3, 4, 5, 6, 7, 8, 9]
Obviously this won't work with iterable_class_with_generator, because it doesn't implement __next__. But there is a deeper difference: since list(x) accepts any iterable, it first calls x.__iter__(), which in the case of iterable_class_with_generator creates a new generator instead of continuing an existing one. If the class keeps no counter of its own (as in the simplified version further down), that new generator starts the count from the beginning. A true generator-based iterator is presented at the end of the answer, but in most cases the difference won't matter.
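A minimal sketch of that restart behaviour, assuming a state-free iterable (FreshCount is a hypothetical name used only for this illustration):

```python
class FreshCount:
    """Hypothetical state-free iterable: every __iter__ call makes a new generator."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        for i in range(self.n):
            yield i


x = FreshCount(10)

# next() needs an iterator; the iterable itself has no __next__.
try:
    next(x)
    next_failed = False
except TypeError:
    next_failed = True

it = iter(x)                      # explicit iterator obtained from the iterable
first_two = [next(it), next(it)]
restarted = list(x)               # list() calls iter(x) again: fresh generator
remainder = list(it)              # the explicit iterator continues where it stopped

print(next_failed)  # True
print(first_two)    # [0, 1]
print(restarted)    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(remainder)    # [2, 3, 4, 5, 6, 7, 8, 9]
```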
Regarding the style difference of whether to use a generator or define your own __next__, both will be recognized as correct Python, so you should choose the one that reads better to the person or team who will maintain the code. Since the generator version is shorter and generators are a well-understood Python idiom, I'd choose that one.
Note that if you implement __iter__ with a generator, you don't need to keep iteration state in instance variables, because the generator does it for you. The code is then even simpler:
class iterable_class_with_generator:
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        for i in range(self.n):
            yield i
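A practical consequence of keeping no state on the instance is that the same object can be looped over several times, even in nested loops. The following sketch repeats the class above so that it runs on its own; the variable names are only for illustration.

```python
class iterable_class_with_generator:
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        for i in range(self.n):
            yield i


x = iterable_class_with_generator(3)

# Each "for" calls iter(x) and gets its own generator,
# so the inner loop does not disturb the outer one.
pairs = [(a, b) for a in x for b in x]
print(len(pairs))  # 9

# The object can also be consumed more than once.
first_pass = list(x)
second_pass = list(x)
print(first_pass, second_pass)  # [0, 1, 2] [0, 1, 2]
```

With the iterator version that returns self from __iter__, the nested loop would share one counter and the second pass would come back empty.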
Finally, here is a version of iterable_class_with_generator that implements a true iterator and uses a generator internally:
class iterable_class_with_generator:
    def __init__(self, n):
        self._gen = self._generate(n)
    def __iter__(self):
        return self
    def __next__(self):
        return next(self._gen)
    def _generate(self, n):
        for i in range(n):
            yield i
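Used like this, the generator-backed iterator should behave the same way as the hand-written iterable_class from the question. The sketch repeats the class so it can be run on its own.

```python
class iterable_class_with_generator:
    def __init__(self, n):
        self._gen = self._generate(n)

    def __iter__(self):
        return self

    def __next__(self):
        # Delegate to the internal generator; its StopIteration propagates.
        return next(self._gen)

    def _generate(self, n):
        for i in range(n):
            yield i


x = iterable_class_with_generator(10)
same_object = iter(x) is x   # an iterator returns itself from __iter__
head = [next(x), next(x)]    # next() works directly on the object
rest = list(x)               # continues from where next() stopped
again = list(x)              # single pass: nothing left afterwards

print(same_object)  # True
print(head)         # [0, 1]
print(rest)         # [2, 3, 4, 5, 6, 7, 8, 9]
print(again)        # []
```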
The devil is in the details.
chepner already mentioned the significant difference in the comments.
iterable_class.__iter__ returns the same iterator (namely, itself) each time it is called, while iterable_class_with_generator.__iter__ returns a new, independent iterator each time.
This can give you surprising results if you are not aware of exactly what's happening.
>>> x = iterable_class_with_generator(5)
>>> it = iter(x)
>>> list(it)
[0, 1, 2, 3, 4]
>>> x.i = 0
>>> list(it)
[]
>>>
>>> x = iterable_class(5)
>>> it = iter(x)
>>> list(it)
[0, 1, 2, 3, 4]
>>> x.i = 0
>>> list(it)
[0, 1, 2, 3, 4]
As you can see, the generator created by calling iter with an instance of iterable_class_with_generator stays exhausted once it raises StopIteration.
The iterator from an instance of iterable_class is that instance itself, so fiddling with x.i can change the state of the iterator.
Conclusion:
If you want an iterator, implement __iter__ (which does nothing but return self) and __next__. If you want an iterable that is not an iterator itself, implement __iter__ and return an iterator in the body of this method.
The two approaches are different, and when what you wanted was an iterable that is not an iterator, the subtle difference can bite you.
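Returning an iterator from __iter__ does not have to involve yield at all. As a rough sketch, assuming a hypothetical container class named Playlist, __iter__ can simply hand out the iterator of an underlying list:

```python
class Playlist:
    """Hypothetical iterable container; it is not an iterator itself."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __iter__(self):
        # Hand out a new list iterator on every call.
        return iter(self._tracks)


p = Playlist(["a", "b", "c"])
it1 = iter(p)
it2 = iter(p)

first_from_it1 = next(it1)   # advances only it1
all_from_it2 = list(it2)     # it2 is independent and still sees everything
rest_of_it1 = list(it1)

print(it1 is it2)        # False
print(first_from_it1)    # a
print(all_from_it2)      # ['a', 'b', 'c']
print(rest_of_it1)       # ['b', 'c']
```

Because the iteration state lives in the iterators and not in the Playlist instance, changing attributes on p between loops cannot revive an exhausted iterator the way x.i = 0 does above.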