 

(list|set|dict) comprehension containing a yield expression does not return a (list|set|dict)

Python 3.3

I've constructed this slightly cryptic piece of Python 3.3:

>>> [(yield from (i, i + 1, i)) for i in range(5)]
<generator object <listcomp> at 0x0000008666D96900>
>>> list(_)
[0, 1, 0, 1, 2, 1, 2, 3, 2, 3, 4, 3, 4, 5, 4]

If I use a generator comprehension inside a list constructor, I get a different result:

>>> list((yield from (i, i + 1, i)) for i in range(5))
[0, 1, 0, None, 1, 2, 1, None, 2, 3, 2, None, 3, 4, 3, None, 4, 5, 4, None]

Why isn't the list comprehension returning a list?

Python 2.7

I can get a similarly odd effect in Python 2 (using a set comprehension, because Python 2 list comprehensions run in the enclosing scope rather than in a scope of their own):

>>> {(yield i) for i in range(5)}
<generator object <setcomp> at 0x0000000004A06120>
>>> list(_)
[0, 1, 2, 3, 4, {None}]

And when using a generator comprehension:

>>> list((yield i) for i in range(5))
[0, None, 1, None, 2, None, 3, None, 4, None]

Where'd that {None} come from?

asked Jan 07 '14 at 14:01 by Eric




2 Answers

Using this as a reference:

Python 3 explanation

This:

values = [(yield from (i, i + 1, i)) for i in range(5)]

Translates to the following in Python 3.x:

def _tmpfunc():
    _tmp = []
    for i in range(5):
        # a yield expression used as a call argument needs its own parentheses
        _tmp.append((yield from (i, i + 1, i)))
    return _tmp
values = _tmpfunc()

Which results in values containing a generator.

That generator first yields every item of each (i, i + 1, i) tuple and then reaches the return statement. In Python 3.3, return _tmp inside a generator raises StopIteration(_tmp); the list constructor treats that exception as the normal end of iteration and discards the value it carries.


On the other hand, this:

list((yield from (i, i + 1, i)) for i in range(5))

Translates to the following in Python 3.x:

def _tmpfunc():
    for i in range(5):
        yield (yield from (i, i + 1, i))

values = list(_tmpfunc())

This time, each yield from expression evaluates to None once it completes, and the outer yield then emits that None after the three values from the tuple.
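
To make the two separate yields easier to see, here is a minimal sketch that splits them onto their own lines. The names genexpr_equivalent and inner_result are hypothetical; the behaviour should match the hand-written translation above on Python 3.3 or later.

```python
def genexpr_equivalent():
    for i in range(5):
        # the explicit yield from written in the expression body;
        # it evaluates to None because a tuple iterator returns nothing
        inner_result = yield from (i, i + 1, i)
        # the implicit yield that every generator expression performs
        yield inner_result

values = list(genexpr_equivalent())
print(values)
```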

answered Sep 28 '22 at 12:09 by Eric


List (set, dict) comprehensions translate to a different code structure from generator expressions. Let's look at a set comprehension:

import dis

def f():
    return {i for i in range(10)}

dis.dis(f.__code__.co_consts[1])
  2           0 BUILD_SET                0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                12 (to 21)
              9 STORE_FAST               1 (i)
             12 LOAD_FAST                1 (i)
             15 SET_ADD                  2
             18 JUMP_ABSOLUTE            6
        >>   21 RETURN_VALUE        

Compare to the equivalent generator expression:

def g():
    return (i for i in range(10))

dis.dis(g.__code__.co_consts[1])
  2           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                11 (to 17)
              6 STORE_FAST               1 (i)
              9 LOAD_FAST                1 (i)
             12 YIELD_VALUE         
             13 POP_TOP             
             14 JUMP_ABSOLUTE            3
        >>   17 LOAD_CONST               0 (None)
             20 RETURN_VALUE        

You'll notice that where the generator expression has a yield, the set comprehension stores a value directly into the set it is building.
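
The exact bytecode listing varies between Python versions (offsets differ, and recent releases inline set comprehensions into the enclosing function), so indexing co_consts[1] may not work everywhere. A hedged, version-tolerant sketch of the same comparison is to collect opcode names from a function and all of its nested code objects. The helper name opnames is an assumption made for this example.

```python
import dis
import types

def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names

def f():
    return {i for i in range(10)}

def g():
    return (i for i in range(10))

set_ops = opnames(f.__code__)
gen_ops = opnames(g.__code__)

print("SET_ADD" in set_ops, "YIELD_VALUE" in set_ops)
print("SET_ADD" in gen_ops, "YIELD_VALUE" in gen_ops)
```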

This means that if you add a yield expression into the body of a generator expression, it is treated indistinguishably from the yield that the language constructs for the generator body; as a result, you get two (or more) values per iteration.

However, if you add a yield to a list (set, dict) comprehension, the comprehension is transformed from a function that builds a list (set, dict) into a generator that executes the yield expressions and then returns the constructed list (set, dict). The {None} in the set comprehension result is the set built from the Nones that the yield expressions evaluate to.
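
One way to check that the final set really is made of whatever the yield expressions evaluate to is to send values into the generator. The sketch below uses a hand-written equivalent of the set comprehension (setcomp_equivalent is an assumed name, not compiler output) and targets Python 3.3 or later, where the built set arrives as StopIteration.value rather than as a final yielded item.

```python
def setcomp_equivalent():
    built = set()
    for i in range(5):
        # whatever the caller sends becomes the value of the yield expression
        built.add((yield i))
    return built

gen = setcomp_equivalent()
received = [next(gen)]  # run to the first yield
built = None
try:
    for label in "abcde":
        received.append(gen.send(label))
except StopIteration as stop:
    built = stop.value  # the set the comprehension was building

print(received)
print(sorted(built))
```

With plain iteration instead of send(), every yield expression evaluates to None, so the set collapses to {None}.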


Finally, why does Python 3.3 not produce a {None}? (Note that previous versions of Python 3 do.) It's because of PEP 380 (a.k.a. yield from support). Prior to Python 3.3, a return with a value in a generator function is a SyntaxError: 'return' with argument inside generator. Our yielding comprehensions therefore exploit undefined behaviour, and the actual effect of the RETURN_VALUE opcode is just to produce another (final) value from the generator. In Python 3.3, return with a value is explicitly supported: a RETURN_VALUE opcode results in a StopIteration being raised, which stops the generator without producing a final value.
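
For context, a small sketch of the PEP 380 behaviour described above, intended for Python 3.3 or later. The generator names inner and outer are hypothetical. It shows that a returned value is not part of the iteration, but can be picked up by a delegating yield from.

```python
def inner():
    yield 1
    yield 2
    return "finished"  # carried by StopIteration.value, never yielded

def outer():
    # yield from evaluates to the return value of the generator it delegates to
    captured = yield from inner()
    yield captured

plain = list(inner())
delegated = list(outer())

print(plain)
print(delegated)
```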

answered Sep 28 '22 at 12:09 by ecatmur