I was trying to implement the inverse of itertools.izip on Python 2.7.1, and I ran into a problem I can't explain. Solution 1, iunzip_v1, works perfectly, but solution 2, iunzip_v2, doesn't work as expected. So far I haven't found any relevant information about this problem, and from reading the PEP about generators it sounds like it should work, but it doesn't.
import itertools
from operator import itemgetter

def iunzip_v1(iterable):
    _tmp, iterable = itertools.tee(iterable, 2)
    iters = itertools.tee(iterable, len(_tmp.next()))
    return tuple(itertools.imap(itemgetter(i), it) for i, it in enumerate(iters))

def iunzip_v2(iterable):
    _tmp, iterable = itertools.tee(iterable, 2)
    iters = itertools.tee(iterable, len(_tmp.next()))
    return tuple((elem[i] for elem in it) for i, it in enumerate(iters))
result:
In [17]: l
Out[17]: [(0, 0, 0), (1, 2, 3), (2, 4, 6), (3, 6, 9), (4, 8, 12)]
In [18]: map(list, iunzip.iunzip_v1(l))
Out[18]: [[0, 1, 2, 3, 4], [0, 2, 4, 6, 8], [0, 3, 6, 9, 12]]
In [19]: map(list, iunzip.iunzip_v2(l))
Out[19]: [[0, 3, 6, 9, 12], [0, 3, 6, 9, 12], [0, 3, 6, 9, 12]]
It seems that iunzip_v2 is using the last value of i, so the inner generators aren't keeping the value that i had when they were created inside the outer generator. I'm missing something and I don't know what it is.
Thanks in advance to anyone who can clarify this situation for me.
UPDATE: I've found the explanation in PEP-289; my first read was PEP-255. The solution I'm trying to implement is a lazy one, so:
zip(*iter) or izip(*...)
doesn't work for me, because the *arg syntax expands the whole argument list up front.
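To illustrate the point about *arg, here is a small sketch, written for Python 3 (where zip is lazy, like izip in 2.7). The names rows and consumed are made up for this illustration. It records which rows have been pulled from the source and shows that star-unpacking drains the whole generator before the first result is requested:

```python
consumed = []

def rows():
    # Hypothetical source of rows; logs each row as it is produced.
    for n in range(3):
        consumed.append(n)
        yield (n, 2 * n)

# Star-unpacking has to build the full argument list,
# so rows() is exhausted on this line, before zip yields anything.
z = zip(*rows())
already_consumed = list(consumed)

columns = list(z)
```

So the result of zip(*...) can be iterated lazily, but the input cannot be consumed lazily this way.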
You're reinventing the wheel in a crazy way. izip is its own inverse:
>>> list(izip(*izip(range(10), range(10))))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
But that doesn't quite answer your question, does it?
The problem with your nested generators is a scoping problem that happens because the innermost generators don't get used until the outermost generator has already run:
def iunzip_v2(iterable):
    _tmp, iterable = itertools.tee(iterable, 2)
    iters = itertools.tee(iterable, len(_tmp.next()))
    return tuple((elem[i] for elem in it) for i, it in enumerate(iters))
Here, you generate three generators, each of which uses the same variable, i. Copies of this variable are not made. Then, tuple exhausts the outermost generator, creating a tuple of generators:
>>> iunzip_v2((range(3), range(3)))
(<generator object <genexpr> at 0x1004d4a50>, <generator object <genexpr> at 0x1004d4aa0>, <generator object <genexpr> at 0x1004d4af0>)
At this point, each of these generators will execute elem[i] for each element of it. And since i is now equal to 2 (the last index that enumerate produced) for all three generators, you get the last element each time.
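The effect can be reproduced without tee at all. The following is a minimal sketch (the names rows, gens and late are illustrative, not from the question) that runs on both Python 2.7 and Python 3:

```python
rows = [(0, 0, 0), (1, 2, 3), (2, 4, 6)]

# Three inner generators are created, but none of them has run yet.
# They all refer to the single variable `i` of the outer generator.
gens = tuple((row[i] for row in rows) for i in range(3))

# By now `tuple` has exhausted the outer generator, so `i` sits at its
# final value, and every inner generator reads the same column.
late = [list(g) for g in gens]
```

Every entry of late is the last column, [0, 3, 6], which mirrors the output of iunzip_v2 in the question.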
The reason the first version works is that itemgetter(i) is called right away, while the outer generator is still on that iteration. The callable it returns stores the value that i had at that moment, so later changes to i do not affect it.
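If you want to keep the generator-expression style, one possible fix is to pass i through a function call so that each inner generator gets its own binding. This is only a sketch: _column and iunzip_v3 are hypothetical names, it assumes a non-empty input, and it uses next(_tmp) instead of _tmp.next() so that it also runs on Python 3:

```python
import itertools

def _column(it, i):
    # `i` is a parameter here, so each call has its own binding of it.
    return (elem[i] for elem in it)

def iunzip_v3(iterable):
    # Peek at the first element to learn how many columns there are.
    _tmp, iterable = itertools.tee(iterable, 2)
    iters = itertools.tee(iterable, len(next(_tmp)))
    return tuple(_column(it, i) for i, it in enumerate(iters))

l = [(0, 0, 0), (1, 2, 3), (2, 4, 6), (3, 6, 9), (4, 8, 12)]
result = [list(col) for col in iunzip_v3(l)]
```

With the list from the question, result matches the output of iunzip_v1.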
OK, this is a bit tricky. When you use a name like i, the value it stands for is looked up only at run time, when the expression is actually evaluated. In this code:
return tuple((elem[i] for elem in it) for i, it in enumerate(iters))
you return a number of generators, (elem[i] for elem in it), and each of them uses the same name i. When the function returns, the loop in tuple( .. for i in .. ) has ended and i has been set to its final value (2 in your example). Once you evaluate these generators to lists, they all produce the same values because they are using the same i.
Btw:
unzip = lambda zipped: zip(*zipped)