I am experimenting with two functions that emulate the zip built-in in Python 2.x and 3.x. The first one returns a list (as in Python 2.x) and the second one is a generator function which returns one piece of its result set at a time (as in Python 3.x):
def myzip_2x(*seqs):
    its = [iter(seq) for seq in seqs]
    res = []
    while True:
        try:
            res.append(tuple([next(it) for it in its]))  # Or use generator expression?
            # res.append(tuple(next(it) for it in its))
        except StopIteration:
            break
    return res

def myzip_3x(*seqs):
    its = [iter(seq) for seq in seqs]
    while True:
        try:
            yield tuple([next(it) for it in its])  # Or use generator expression?
            # yield tuple(next(it) for it in its)
        except StopIteration:
            return

print(myzip_2x('abc', 'xyz123'))
print(list(myzip_3x([1, 2, 3, 4, 5], [7, 8, 9])))
This works well and gives the expected output of the zip built-in:
[('a', 'x'), ('b', 'y'), ('c', 'z')]
[(1, 7), (2, 8), (3, 9)]
Then I thought about replacing the list comprehension within the tuple() calls with its (almost) equivalent generator expression, by deleting the square brackets [] (why create a temporary list using the comprehension when the generator should be fine for the iterable expected by tuple(), right?).
However, this causes Python to hang. If the execution is not terminated using Ctrl+C (in IDLE on Windows), it will eventually stop after several minutes with an (expected) MemoryError exception.
Debugging the code (using PyScripter for example) revealed that the StopIteration exception is never raised when the generator expression is used. The first example call above to myzip_2x() keeps on adding empty tuples to res, while the second example call to myzip_3x() yields the tuples (1, 7), (2, 8), (3, 9), (4,), (5,), (), (), (), ...
Am I missing something?
And a final note: the same hanging behaviour appears if its becomes a generator (using its = (iter(seq) for seq in seqs)) in the first line of each function (when list comprehensions are used in the tuple() call).
Thanks @Blckknght for the explanation, you were right. This message gives more details on what is happening using a similar example to the generator function above. In conclusion, using generator expressions like so only works in Python 3.5+, and it requires the from __future__ import generator_stop statement at the top of the file and replacing StopIteration with RuntimeError in the except clauses above (again, when using generator expressions instead of list comprehensions).
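As a rough sketch of that conclusion, here is the generator function with the generator expression and the changed except clause. It assumes Python 3.7 or later, where the PEP 479 behaviour is the default; on 3.5 and 3.6 the from __future__ import generator_stop line would be needed at the top of the file. The name myzip_3x_genexp and the guard for an empty argument list are illustrative additions, not part of the code above.

```python
# Assumes Python 3.7+ (PEP 479 behaviour on by default).
# On 3.5/3.6, "from __future__ import generator_stop" must be the
# first statement of the file for this to behave the same way.

def myzip_3x_genexp(*seqs):
    its = [iter(seq) for seq in seqs]
    if not its:
        # Illustrative guard: without it, zero arguments would yield () forever.
        return
    while True:
        try:
            # next(it) raises StopIteration inside the generator expression;
            # the interpreter converts it to RuntimeError before tuple() sees it.
            yield tuple(next(it) for it in its)
        except RuntimeError:
            return

print(list(myzip_3x_genexp([1, 2, 3, 4, 5], [7, 8, 9])))
# [(1, 7), (2, 8), (3, 9)]
```

Catching RuntimeError is broader than catching StopIteration, so an unrelated RuntimeError raised while reading one of the inputs would also end the loop silently.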
As for the final note above: if its becomes a generator (using its = (iter(seq) for seq in seqs)) it will support just one iteration, because generators are one-shot iterators. Therefore it is exhausted the first time the while loop is run and on subsequent loops only empty tuples are obtained.
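The one-shot behaviour can be seen in isolation with a small sketch that performs two passes of the loop body by hand. The variable names first and second are only for illustration.

```python
seqs = ('abc', 'xyz123')

# its is a generator here, not a list, so it can be iterated only once.
its = (iter(seq) for seq in seqs)

# First pass: the generator hands out both iterators, as intended.
first = tuple([next(it) for it in its])

# Second pass: its is already exhausted, so the comprehension body
# never runs, no StopIteration is raised, and the result is empty.
second = tuple([next(it) for it in its])

print(first)   # ('a', 'x')
print(second)  # ()
```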
Introduced with PEP 255, generator functions are a special kind of function that returns a lazy iterator. Lazy iterators are objects that you can loop over like a list. However, unlike lists, they do not store their contents in memory.
This generator uses an iterator, because the "for" loop is implemented using an iterator. If you time these, the generator is consistently faster.
Generators in Python provide an efficient way of producing numbers or other objects as and when they are needed, without having to store all the values in memory beforehand.
Advantages of using generators: memory is saved because items are produced only when required, unlike a normal Python function that builds and returns its whole result at once. This becomes very important when you need to produce a huge number of items, and it is often considered the biggest advantage of generators. Generators can also be used to produce an infinite number of items.
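A minimal sketch of the last point: the generator below never terminates on its own, yet only the items actually requested are ever produced. The function name naturals is made up for this example.

```python
from itertools import islice

def naturals():
    """Yield 0, 1, 2, ... without end; nothing is stored in advance."""
    n = 0
    while True:
        yield n
        n += 1

# islice stops asking after five items, so the infinite loop is never a problem.
first_five = list(islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```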
The behavior you're seeing is a bug. It stems from the fact that a StopIteration exception bubbling out of a generator is indistinguishable from the generator exiting normally. This means you can't wrap a loop on a generator with a try and except and look for StopIteration to break you out of the loop, as the loop logic will consume the exception.
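Because newer Python versions no longer let a generator leak StopIteration, the old behaviour is easiest to reproduce with a hand-written iterator class, which is still allowed to do so. The sketch below is only a stand-in for the pre-PEP 479 generator expression (the class name LeakyFirstItems is hypothetical); it shows that tuple() cannot tell "an inner iterator ran out" apart from "there are no more iterators to ask".

```python
class LeakyFirstItems:
    """Stand-in for (next(it) for it in its) as it behaved before PEP 479."""

    def __init__(self, its):
        self._its = iter(its)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration can come from either next() call here:
        # the outer one (normal end) or the inner one (an input ran out).
        # The consumer sees the same exception in both cases.
        return next(next(self._its))


its = [iter([1, 2, 3, 4, 5]), iter([7, 8, 9])]

# Six passes of the loop body from the question.
results = [tuple(LeakyFirstItems(its)) for _ in range(6)]
print(results)
# [(1, 7), (2, 8), (3, 9), (4,), (5,), ()]
```

The truncated tuples (4,) and (5,) and the endless run of () match the output reported in the question: tuple() treats the leaked StopIteration as a normal end of input, so nothing ever reaches the surrounding try/except.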
PEP 479 proposes a fix for the issue, by changing the language to make an uncaught StopIteration inside a generator turn into a RuntimeError before bubbling up. This will allow your code to work (with a small tweak to the type of exception you catch).
The PEP has been implemented in Python 3.5, but to preserve backwards compatibility, the changed behavior is only available there if you request it by putting from __future__ import generator_stop at the top of your files. The new behavior is enabled by default from Python 3.7 onwards (Python 3.6 still defaults to the old behavior, but issues a DeprecationWarning when the situation comes up).
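Since the exception to catch depends on the Python version, one way to sidestep the question is to avoid relying on an exception at all. The sketch below is an alternative technique, not the approach from the question or from PEP 479: it uses the two-argument form of next() with a sentinel default. The names _MISSING and myzip_sentinel are made up for illustration.

```python
_MISSING = object()  # unique marker that cannot appear in the inputs

def myzip_sentinel(*seqs):
    its = [iter(seq) for seq in seqs]
    if not its:
        # Illustrative guard: with no inputs there is nothing to yield.
        return
    while True:
        # next(it, default) returns the default instead of raising,
        # so no StopIteration ever occurs inside the generator expression.
        items = tuple(next(it, _MISSING) for it in its)
        if any(item is _MISSING for item in items):
            return
        yield items

print(list(myzip_sentinel('abc', 'xyz123')))
# [('a', 'x'), ('b', 'y'), ('c', 'z')]
```

One difference from the built-in zip is worth keeping in mind: on the final pass this version still pulls one item from every remaining iterator before noticing that one of them is exhausted, whereas zip stops at the first exhausted input.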