
__next__ in generators and iterators and what is a method-wrapper?

I was reading about generators and iterators and the role of __next__().

'__next__' in dir(mygen) is true.

'__next__' in dir(mylist) is false.

As I looked deeper into it,

'__next__' in dir(mylist.__iter__()) is true.

  1. Why is __next__ not available on mylist itself, but only on mylist.__iter__() and on mygen? How does __iter__() call __next__ when we step through the list using a list comprehension?

    Trying to manually step (+1) through the generator, I called mygen.__next__(). It doesn't exist. It only exists as mygen.__next__, which is called a method-wrapper.

  2. What is a method-wrapper and what does it do? How is it applied here, in mygen() and __iter__()?

  3. If __next__ is what both generators and iterators provide (and it is their sole property), then what is the difference between a generator and an iterator?

    Answer to 3: Solved, as noted by mod/editor:

    Difference between Python's Generators and Iterators

UPDATE: Both generators and iterators have __next__(). My mistake. Looking at the logs, my mygen.__next__() test was somehow raising a StopIteration exception, but I wasn't able to reproduce that error.
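
For reference, the checks above can be reproduced with a short sketch. The definitions of mylist and mygen are assumptions, since the question does not show how they were created. The "method-wrapper" label is how CPython displays a bound special method that is implemented in C; it is an ordinary callable.

```python
# Assumed definitions: the question does not show how these were created.
mylist = [1, 2, 3]
mygen = (n for n in mylist)

print('__next__' in dir(mygen))              # True: a generator is an iterator
print('__next__' in dir(mylist))             # False: a list is only an iterable
print('__next__' in dir(mylist.__iter__()))  # True: the list's iterator has it

# Without parentheses, mygen.__next__ is the bound method object itself.
# CPython reports it as a "method-wrapper" (an implementation detail).
bound = mygen.__next__
print(callable(bound))  # True
print(bound())          # 1, the same as calling next(mygen)
print(next(mygen))      # 2
```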

Thanks everyone for answering!

theMobDog asked Oct 26 '16 06:10


1 Answer

The special methods __iter__ and __next__ are part of the iterator protocol to create iterator types. For this purpose, you have to differentiate between two separate things: Iterables and iterators.

Iterables are things that can be iterated over. Usually these are containers that hold items. Common examples are lists, tuples, or dictionaries.

In order to iterate over an iterable, you use an iterator. An iterator is the object that helps you iterate through the container. For example, when iterating over a list, the iterator essentially keeps track of which index you are currently at.

To get an iterator, the __iter__ method is called on the iterable. This is like a factory method that returns a new iterator for this specific iterable. Defining an __iter__ method on a type is what makes that type an iterable.

The iterator generally needs a single method, __next__, which returns the next item for the iteration. In addition, to make the protocol easier to use, every iterator should also be an iterable, returning itself in the __iter__ method.
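
A minimal sketch of these two rules using a built-in list (the variable names are only illustrative): each iter() call on the iterable produces a fresh iterator with its own position, while calling iter() on an iterator hands back the same object.

```python
numbers = [10, 20, 30]

first = iter(numbers)   # calls the list's __iter__
second = iter(numbers)  # a second, independent iterator

print(first is second)       # False: each call returns a new iterator
print(iter(first) is first)  # True: an iterator returns itself from __iter__

print(next(first))   # 10
print(next(first))   # 20
print(next(second))  # 10, because second keeps its own position
```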

As a quick example, this would be a possible iterator implementation for a list:

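class ListIterator:
    def __init__(self, lst):
        self.lst = lst
        self.idx = 0  # position of the next item to return

    def __iter__(self):
        # an iterator is its own iterable
        return self

    def __next__(self):
        try:
            item = self.lst[self.idx]
        except IndexError:
            # no more items: signal the end of the iteration
            raise StopIteration()
        self.idx += 1
        return item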

The list implementation could then simply return ListIterator(self) from the __iter__ method. Of course, the actual implementation for lists is done in C, so this looks a bit different. But the idea is the same.
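
To make that concrete, here is a sketch of a container that hands out ListIterator objects. MyList is a hypothetical class invented for illustration only; it is not how the built-in list is implemented.

```python
class ListIterator:
    def __init__(self, lst):
        self.lst = lst
        self.idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            item = self.lst[self.idx]
        except IndexError:
            raise StopIteration()
        self.idx += 1
        return item


class MyList:
    """Hypothetical list-like container, used only for illustration."""

    def __init__(self, *items):
        self.items = list(items)

    def __getitem__(self, idx):
        # raises IndexError past the end, which ListIterator relies on
        return self.items[idx]

    def __iter__(self):
        # factory method: a fresh iterator for every request
        return ListIterator(self)


values = MyList('a', 'b', 'c')
print(list(values))  # ['a', 'b', 'c']
print(list(values))  # ['a', 'b', 'c'] again, since each pass gets a new iterator
```

Because __iter__ returns a new ListIterator every time, the container can be looped over repeatedly, while each individual iterator is used up after one pass.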

Iterators are used invisibly in various places in Python. For example a for loop:

for item in lst:
    print(item)

This is roughly equivalent to the following:

lst_iterator = iter(lst) # this just calls `lst.__iter__()`
while True:
    try:
        item = next(lst_iterator) # lst_iterator.__next__()
    except StopIteration:
        break
    else:
        print(item)

So the for loop requests an iterator from the iterable object, and then calls __next__ on that iterator until it hits the StopIteration exception. That this happens under the surface is also the reason why you would want iterators to implement __iter__ as well: Otherwise you could never loop over an iterator.
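
A small sketch of that last point. NextOnly is a hypothetical class that implements __next__ but deliberately omits __iter__: calling next() on it works, but a for loop rejects it.

```python
class NextOnly:
    """Hypothetical class with __next__ but no __iter__."""

    def __init__(self):
        self.remaining = 3

    def __next__(self):
        if self.remaining == 0:
            raise StopIteration
        self.remaining -= 1
        return self.remaining


it = NextOnly()
print(next(it))  # 2: calling next() directly works

error = None
try:
    for item in it:  # the for loop first calls iter(it), which fails
        print(item)
except TypeError as exc:
    error = str(exc)
    print(error)  # reports that the 'NextOnly' object is not iterable
```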


As for generators, what people usually refer to is actually a generator function, i.e. a function definition that contains yield statements. Once you call that generator function, you get back a generator. A generator is essentially just an iterator, albeit a fancy one (since it does more than move through a container). As an iterator, it has a __next__ method to “generate” the next element, and an __iter__ method to return itself.


An example generator function would be the following:

def exampleGenerator():
    yield 1
    print('After 1')
    yield 2
    print('After 2')

The yield statement in the function body turns this into a generator function. That means that when you call exampleGenerator() you get back a generator object. Generator objects implement the iterator protocol, so we can call __next__ on them (or use the next() function as above):

>>> x = exampleGenerator()
>>> next(x)
1
>>> next(x)
After 1
2
>>> next(x)
After 2
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    next(x)
StopIteration

Note that the first next() call did not print anything yet. This is the special thing about generators: They are lazy and only evaluate as much as necessary to get the next item from the iterable. Only with the second next() call do we get the first printed line from the function body. And we need a third next() call to exhaust the iterable: it prints the last line and then raises StopIteration, since no further value is yielded.

But apart from that laziness, generators act just like any other iterator. You even get a StopIteration exception at the end, which allows generators (the objects returned by calling generator functions) to be used as for loop sources and wherever “normal” iterables can be used.
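
As a brief sketch of this, the following uses a print-free variant of the generator function above (the name example_generator is only illustrative) as a for loop source and as an argument to list():

```python
def example_generator():
    yield 1
    yield 2


gen = example_generator()
print(iter(gen) is gen)  # True: a generator is its own iterator

collected = []
for value in gen:  # the loop stops quietly when StopIteration is raised
    collected.append(value)
print(collected)  # [1, 2]

print(list(gen))                  # []: this generator object is now exhausted
print(list(example_generator()))  # [1, 2]: a new call gives a fresh generator
```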

The big benefit of generators and their laziness is the ability to generate items on demand. A nice analogy for this is endless scrolling on websites: You can scroll down item after item (calling next() on the generator), and every once in a while, the website will have to query a backend to retrieve more items for you to scroll through. Ideally, this happens without you noticing. And that’s exactly what a generator does. It even allows for things like this:

def counter():
    x = 0
    while True:
        x += 1
        yield x

Computed eagerly, this would be impossible since the loop is infinite. But lazily, as a generator, it’s possible to consume this iterable one item after another. I originally wanted to spare you from implementing this generator as a fully custom iterator type, but in this case, this actually isn’t too difficult, so here it goes:

class CounterGenerator:
    def __init__(self):
        self.x = 0

    def __iter__(self):
        return self

    def __next__(self):
        # never raises StopIteration, so the iteration is endless
        self.x += 1
        return self.x
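
Since neither version ever stops on its own, the consumer has to decide when to stop. One way to sketch this is with itertools.islice from the standard library, which takes only the first few items; both definitions are repeated here so that the snippet runs by itself.

```python
from itertools import islice


def counter():
    x = 0
    while True:
        x += 1
        yield x


class CounterGenerator:
    def __init__(self):
        self.x = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.x += 1
        return self.x


# islice stops after five items, so the infinite source is never exhausted
print(list(islice(counter(), 5)))           # [1, 2, 3, 4, 5]
print(list(islice(CounterGenerator(), 5)))  # [1, 2, 3, 4, 5]
```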

poke answered Oct 09 '22 21:10