
Generator expression never raises StopIteration

This is inspired by my own answer, which I didn't fully understand myself. Consider the following:

def has22(nums):
    it = iter(nums)
    return any(x == 2 == next(it) for x in it)


>>> has22([2, 1, 2])
False

I expected a StopIteration to be raised, since upon reaching the final 2, next(it) would be advancing an exhausted iterator. However, it appears that this behavior has been completely disabled, for generator expressions only! The generator expression seems to stop immediately once this happens.

>>> it = iter([2, 1, 2]); any(x == 2 == next(it) for x in it)
False
>>> it = iter([2, 1, 2]); any([x == 2 == next(it) for x in it])

Traceback (most recent call last):
  File "<pyshell#114>", line 1, in <module>
    it = iter([2, 1, 2]); any([x == 2 == next(it) for x in it])
StopIteration
>>> def F(nums):
        it = iter(nums)
        for x in it:
            if x == 2 == next(it): return True


>>> F([2, 1, 2])

Traceback (most recent call last):
  File "<pyshell#117>", line 1, in <module>
    F([2, 1, 2])
  File "<pyshell#116>", line 4, in F
    if x == 2 == next(it): return True
StopIteration

Even this works!

>>> it = iter([2, 1, 2]); list((next(it), next(it), next(it), next(it)) for x in it)
[]

So I guess my question is, why is this behavior enabled for generator expressions?

Note: Same behavior in 3.x (as of this question, i.e. before PEP 479).

asked May 29 '13 12:05 by jamylak


2 Answers

The devs have decided that allowing this was a mistake because it can mask obscure bugs. With the acceptance of PEP 479, this behavior is going away.

In Python 3.5 and 3.6 with from __future__ import generator_stop, and in Python 3.7 and later by default, the example in the question will fail with a RuntimeError. You could still achieve the same effect (allowing nums to not be precomputed) with some itertools magic:

from itertools import tee, islice

def has22(nums):
    its = tee(nums, 2)
    return any(x == y == 2 for x, y in 
               zip(its[0], islice(its[1], 1, None)))
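
As a rough check of the tee/islice approach, here is a minimal sketch that runs it next to a second variant. The names has22_pairs and has22_default are made up for this illustration; has22_default is an assumption of mine, not part of the answer: it keeps the single shared iterator from the question but passes a default to next(), so exhaustion produces None instead of raising.

```python
from itertools import islice, tee


def has22_pairs(nums):
    # Same idea as the tee/islice version: compare each item with its successor.
    first, second = tee(nums, 2)
    return any(x == y == 2 for x, y in zip(first, islice(second, 1, None)))


def has22_default(nums):
    # Hypothetical variant: one shared iterator, as in the question, but
    # next() gets a default so an exhausted iterator yields None instead
    # of raising StopIteration inside the generator expression.
    it = iter(nums)
    return any(x == 2 == next(it, None) for x in it)


for sample in ([2, 1, 2], [1, 2, 2], [2, 2], [], [2]):
    print(sample, has22_pairs(sample), has22_default(sample))
```

Both functions also accept one-shot iterators such as iter([1, 2, 2]), since neither needs len() or indexing.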

The reason it ever worked in the first place has to do with how generators work. You can think of this for loop:

for a in b:
    # do stuff

As being (roughly) equivalent to this:

b = iter(b) 
while True:
    try:
        a = next(b)
    except StopIteration:
        break
    else:
        # do stuff
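
To make the desugaring concrete, here is a small runnable sketch. The helper manual_for and the callback stop are hypothetical names introduced only for this illustration. It shows that the except clause guards only the next call, so a StopIteration raised by the loop body escapes the loop.

```python
def manual_for(iterable, body):
    """Hand-written stand-in for `for a in iterable: body(a)`."""
    iterator = iter(iterable)
    while True:
        try:
            a = next(iterator)
        except StopIteration:
            break  # only exhaustion of `iterator` ends the loop here
        else:
            body(a)  # runs outside the try, so its exceptions propagate


collected = []
manual_for([1, 2, 3], collected.append)
print(collected)


def stop(_):
    raise StopIteration  # raised by the body, not by next(iterator)


escaped = False
try:
    manual_for([1, 2, 3], stop)
except StopIteration:
    escaped = True
print("StopIteration escaped the loop:", escaped)
```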

Now, all the examples have two for loops nested together (one in the generator expression, one in the function consuming it), so that the inner loop iterates once when the outer loop performs its next call. What happens when the '# do stuff' in the inner loop is raise StopIteration?

>>> def foo(): raise StopIteration
>>> list(foo() for x in range(10))
[]

The exception propagates out of the inner loop, because the raise happens outside the try/except guard around that loop's next call, and gets caught by the outer loop, which treats it as the end of iteration. Under the new behavior, Python will intercept a StopIteration that is about to propagate out of a generator and replace it with a RuntimeError, which won't be caught by the containing for loop.
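
A minimal sketch of the new behavior, assuming Python 3.7 or later (on older interpreters without the __future__ import, list() would simply return [] and the last line would not apply). The wrapper run is a made-up name; the original StopIteration is still reachable through the RuntimeError's __cause__.

```python
def foo():
    raise StopIteration


def run():
    # Assumes PEP 479 semantics (the default from Python 3.7 on).
    try:
        return list(foo() for x in range(10))
    except RuntimeError as exc:
        return exc


result = run()
print(type(result).__name__, "caused by", type(result.__cause__).__name__)
```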

This also has the implication that code like this:

def a_generator():
    yield 5
    raise StopIteration

will also fail, and the mailing list thread gives the impression that this was considered bad form anyway. The proper way to do this is:

def a_generator():
    yield 5
    return
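
The two spellings can be compared side by side. This is a sketch assuming Python 3.7 or later; the function names stops_with_raise and stops_with_return are invented here to keep both versions in one file.

```python
def stops_with_raise():
    yield 5
    raise StopIteration  # pre-PEP 479 style


def stops_with_return():
    yield 5
    return  # ends the generator cleanly


print(list(stops_with_return()))

error = None
try:
    list(stops_with_raise())
except RuntimeError as exc:
    error = exc  # keep a reference; `exc` is unbound after the except block
print(type(error).__name__)
```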

As you pointed out, list comprehensions already behave differently:

>>> [foo() for x in range(10)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "<stdin>", line 1, in foo
StopIteration

This is something of a leaking implementation detail: list comprehensions don't get transformed into a call to list with an equivalent generator expression, and apparently doing so would cause performance penalties that the powers that be consider prohibitive.
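
The difference between the two forms can be observed directly. In this sketch, outcome is a hypothetical helper that reports which exception type, if any, escapes; the RuntimeError result assumes Python 3.7 or later.

```python
def foo():
    raise StopIteration


def outcome(thunk):
    # Call thunk() and report the name of the exception type that escapes.
    try:
        return thunk()
    except Exception as exc:
        return type(exc).__name__


from_listcomp = outcome(lambda: [foo() for x in range(10)])
from_genexp = outcome(lambda: list(foo() for x in range(10)))
print(from_listcomp, from_genexp)
```

The list comprehension lets the StopIteration through unchanged because no generator frame sits between foo() and the caller, while the generator expression has one, and that frame is where the conversion to RuntimeError happens.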

answered Sep 24 '22 16:09 by lvc


Interesting behaviour, but absolutely understandable.

If you transform your generator expression to a generator:

def _has22_iter(it):
    for x in it:
        yield x == 2 and x == next(it)

def has22(nums):
    it = iter(nums)
    return any(_has22_iter(it))

your generator raises StopIteration in the following conditions:

  • the generator function reaches its end
  • there is a return statement somewhere
  • there is a raise StopIteration somewhere

Here, you have the third condition, so the generator is terminated.

Compare with the following:

def testgen(x):
    if x == 0:
        next(iter([])) # implicitly raise
    if x == 1:
        raise StopIteration
    if x == 2:
        return
    if False:
        yield # never runs; its presence makes testgen a generator function

and do

list(testgen(0)) # --> []
list(testgen(1)) # --> []
list(testgen(2)) # --> []
list(testgen(3)) # --> []

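you get the same behaviour in all cases on interpreters that predate PEP 479. From Python 3.7 on, the first two calls raise RuntimeError instead, because a StopIteration escaping a generator body is converted.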

answered Sep 20 '22 16:09 by glglgl