
Python nested generators

I was trying to implement the inverse of itertools.izip on Python 2.7.1, and I ran into a problem I can't explain. Solution 1, iunzip_v1, works perfectly. Solution 2, iunzip_v2, doesn't work as expected. So far I haven't found any relevant information about this problem, and from reading the PEP about generators it sounds like it should work, but it doesn't.

import itertools
from operator import itemgetter

def iunzip_v1(iterable):
    _tmp, iterable = itertools.tee(iterable, 2)
    iters = itertools.tee(iterable, len(_tmp.next()))
    return tuple(itertools.imap(itemgetter(i), it) for i, it in enumerate(iters))

def iunzip_v2(iterable):
    _tmp, iterable = itertools.tee(iterable, 2)
    iters = itertools.tee(iterable, len(_tmp.next()))
    return tuple((elem[i] for elem in it) for i, it in enumerate(iters))

result:

In [17]: l
Out[17]: [(0, 0, 0), (1, 2, 3), (2, 4, 6), (3, 6, 9), (4, 8, 12)]

In [18]: map(list, iunzip.iunzip_v1(l))
Out[18]: [[0, 1, 2, 3, 4], [0, 2, 4, 6, 8], [0, 3, 6, 9, 12]]

In [19]: map(list, iunzip.iunzip_v2(l))
Out[19]: [[0, 3, 6, 9, 12], [0, 3, 6, 9, 12], [0, 3, 6, 9, 12]]

It seems that iunzip_v2 is using the last value of i, so the inner generators aren't keeping the value i had when they were created inside the outer generator. I'm missing something and I don't know what it is.

Thanks in advance if someone can clarify this situation for me.

UPDATE: I've found the explanation in PEP 289 (my first read was PEP 255). The solution I'm trying to implement is a lazy one, so:

  zip(*iter) or izip(*...)

doesn't work for me, because unpacking with *arg expands the whole argument list up front, which consumes the entire iterable.
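To illustrate the laziness problem, here is a minimal sketch. The names `pulled` and `rows` are hypothetical, made up for this example; the point is only that the `*` unpacking drains the source before zip returns anything.

```python
pulled = []  # records every row the generator has produced so far

def rows():
    # hypothetical lazy source of 3-tuples
    for n in range(5):
        pulled.append(n)
        yield (n, 2 * n, 3 * n)

columns = zip(*rows())  # the * forces rows() to run to completion here
print(pulled)           # [0, 1, 2, 3, 4]
```

Every row has already been pulled before a single column value is requested, which is exactly what a lazy unzip should avoid.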

Andrés Moreira asked Jun 21 '11 22:06



2 Answers

You're reinventing the wheel in a crazy way. izip is its own inverse:

>>> from itertools import izip
>>> list(izip(*izip(range(10), range(10))))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]

But that doesn't quite answer your question, does it?

The problem with your nested generators is a scoping problem that happens because the innermost generators don't get used until the outermost generator has already run:

def iunzip_v2(iterable):
    _tmp, iterable = itertools.tee(iterable, 2)
    iters = itertools.tee(iterable, len(_tmp.next()))
    return tuple((elem[i] for elem in it) for i, it in enumerate(iters))

Here, you generate three generators, each of which uses the same variable, i. Copies of this variable are not made. Then, tuple exhausts the outermost generator, creating a tuple of generators:

>>> iunzip_v2((range(3), range(3)))
(<generator object <genexpr> at 0x1004d4a50>, <generator object <genexpr> at 0x1004d4aa0>, <generator object <genexpr> at 0x1004d4af0>)

At this point, each of these generators will evaluate elem[i] for each element of it. Since i has already reached its final value for all three generators (2, the last index that enumerate produces for three iterators), you get the last element each time.

The reason the first version works is that itemgetter(i) is called while the outermost generator is still running, so the value i has at that moment is stored inside the callable it returns. Each call produces a separate object holding its own index, and later changes to i don't affect it.
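One possible way to keep the generator-expression style and still get the right result is to bind i through a function call. This is only a sketch: `_column` and `iunzip_v3` are hypothetical names, not part of the original question, and `next(_tmp)` is used instead of `_tmp.next()` so the same code also runs on Python 3.

```python
import itertools

def _column(it, i):
    # i is a local variable of this call, so each generator
    # created here keeps its own index
    return (elem[i] for elem in it)

def iunzip_v3(iterable):
    _tmp, iterable = itertools.tee(iterable, 2)
    # peek at the first element to learn how many columns there are
    iters = itertools.tee(iterable, len(next(_tmp)))
    return tuple(_column(it, i) for i, it in enumerate(iters))

l = [(0, 0, 0), (1, 2, 3), (2, 4, 6), (3, 6, 9), (4, 8, 12)]
print([list(col) for col in iunzip_v3(l)])
# [[0, 1, 2, 3, 4], [0, 2, 4, 6, 8], [0, 3, 6, 9, 12]]
```

This works for the same reason the itemgetter version does: the index is fixed at the moment the helper is called, not when the inner generator is finally consumed.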

senderle answered Oct 24 '22 09:10



OK, this is a bit tricky. When you use a name like i, the value it stands for is looked up only when the code that uses it actually runs. In this code:

return tuple((elem[i] for elem in it) for i, it in enumerate(iters))

you return a number of generators, (elem[i] for elem in it), and each of them uses the same name i. By the time the function returns, the loop in tuple( .. for i in .. ) has ended and i has been set to its final value (2 in your example, the index of the last column). Once you evaluate these generators to lists, they all produce the same values because they are all using the same i.
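The same effect can be reproduced without tee or tuples of rows. This is a stripped-down sketch, not the code from the question; `gens`, `late` and `eager` are names made up for the illustration.

```python
# Each inner generator reads i only when it is iterated,
# not when it is created.
gens = tuple((i * 10 + x for x in range(2)) for i in range(3))
late = [list(g) for g in gens]
print(late)   # [[20, 21], [20, 21], [20, 21]]

# Consuming each inner generator right away, while i still has
# the intended value, gives the expected result (but is not lazy).
eager = tuple(list(i * 10 + x for x in range(2)) for i in range(3))
print(eager)  # ([0, 1], [10, 11], [20, 21])
```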

Btw:

unzip = lambda zipped: zip(*zipped) 
Jochen Ritzel answered Oct 24 '22 10:10
