From what I understand, a for x in a_generator: foo(x)
loop in Python is roughly equivalent to this:
    try:
        while True:
            foo(next(a_generator))
    except StopIteration:
        pass
That suggests that something like this:
    for outer_item in a_generator:
        if should_inner_loop(outer_item):
            for inner_item in a_generator:
                foo(inner_item)
                if stop_inner_loop(inner_item): break
        else:
            bar(outer_item)
would iterate over a_generator in the outer loop until it reaches some x where should_inner_loop(x) returns truthy, then loop over it in the inner for until stop_inner_loop(thing) returns true. Then, the outer loop resumes where the inner one left off.
From my admittedly not very good tests, it seems to perform as above. However, I couldn't find anything in the spec guaranteeing that this behavior is constant across interpreters. Is there anywhere that says or implies that I can be sure it will always be like this? Can it cause errors, or perform in some other way (i.e. do something other than what's described above)?
N.B. The code equivalent above is taken from my own experience; I don't know if it's actually accurate. That's why I'm asking.
Simply speaking, a generator is a function that returns an object (iterator) which we can iterate over (one value at a time).
A generator is a construct in Python that allows for lazy or ad hoc loading of a stream of data. Generators can be looped over like a list, but they also maintain state between values. A generator function uses the yield keyword, which is similar to return except that the function is suspended rather than ended.
Generators, like all iterators, can be exhausted. Unless your generator is infinite, you can iterate through it one time only. Once all values have been evaluated, iteration will stop and the for loop will exit. If you call next() on an exhausted generator, you'll get an explicit StopIteration exception instead.
Looping over a generator always goes through an iterator, because the "for" loop itself is implemented using an iterator.
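The exhaustion behaviour described above can be checked with a small sketch. The generator count_up_to is a hypothetical example, not something from the question.

```python
def count_up_to(limit):
    """Hypothetical generator yielding 0, 1, ..., limit - 1."""
    for n in range(limit):
        yield n

numbers = count_up_to(3)
first_pass = list(numbers)    # consumes every value
second_pass = list(numbers)   # the generator is already exhausted, so this is empty

try:
    next(numbers)             # an explicit next() on an exhausted generator
    raised = False
except StopIteration:
    raised = True
```

Here first_pass holds all three values, second_pass is empty, and the explicit next() call raises StopIteration.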
TL;DR: it is safe with CPython (but I could not find any specification of this), although it may not do what you want to do.
First, let's talk about your first assumption, the equivalence.
A for loop actually first calls iter() on the object, then runs next() on its result, until it gets a StopIteration.
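As a rough sketch of that description (not the interpreter's actual implementation), the loop can be written out by hand. The helper name for_each is made up for illustration.

```python
def for_each(iterable, body):
    """Hypothetical helper: roughly what `for x in iterable: body(x)` does."""
    iterator = iter(iterable)       # called once, before the loop starts
    while True:
        try:
            item = next(iterator)   # only this call is guarded
        except StopIteration:
            break                   # the iterator is exhausted, so leave the loop
        body(item)

seen = []
for_each([1, 2, 3], seen.append)

from_generator = []
for_each((n * 2 for n in range(3)), from_generator.append)
```

Note that the try block wraps only the next() call. In the version from the question, a StopIteration raised inside foo() would also be swallowed, which a real for loop does not do.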
Here is the relevant bytecode (a low-level form of Python, used by the interpreter itself; the exact opcodes vary between CPython versions):
    >>> import dis
    >>> def f():
    ...     for x in y:
    ...         print(x)
    ...
    >>> dis.dis(f)
      2           0 SETUP_LOOP              24 (to 27)
                  3 LOAD_GLOBAL              0 (y)
                  6 GET_ITER
            >>    7 FOR_ITER                16 (to 26)
                 10 STORE_FAST               0 (x)

      3          13 LOAD_GLOBAL              1 (print)
                 16 LOAD_FAST                0 (x)
                 19 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 22 POP_TOP
                 23 JUMP_ABSOLUTE            7
            >>   26 POP_BLOCK
            >>   27 LOAD_CONST               0 (None)
                 30 RETURN_VALUE
GET_ITER calls iter(y) (which itself calls y.__iter__()) and pushes its result on the stack (think of it as a bunch of local unnamed variables). Execution then enters the loop at FOR_ITER, which calls next(<iterator>) (which itself calls <iterator>.__next__()), then executes the code inside the loop, and the JUMP_ABSOLUTE makes execution jump back to FOR_ITER.
Now, for the safety:
Here are the methods of a generator: https://hg.python.org/cpython/file/101404/Objects/genobject.c#l589
As you can see at line 617, the implementation of __iter__() is PyObject_SelfIter, whose implementation you can find here. PyObject_SelfIter simply returns the object (i.e. the generator) itself.
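You can observe the same thing from Python without reading the C source. This is a minimal check; the generator letters is a made-up example.

```python
def letters():
    """Hypothetical generator used only for this check."""
    yield "a"
    yield "b"

gen = letters()
same_object = iter(gen) is gen    # a generator is its own iterator

a_list = [1, 2]
distinct_iterators = iter(a_list) is not iter(a_list)  # a list hands out a new iterator each time
```

The contrast with the list is the reason nesting two loops over a list restarts the inner loop from the beginning, while nesting two loops over a generator does not.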
So, when you nest the two loops, both iterate on the same iterator. And, as you said, they are just calling next() on it, so it's safe.
But be cautious: the inner loop will consume items that will not be consumed by the outer loop. Even if that is what you want to do, it may not be very readable.
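To make the consumption visible, here is a sketch that replays the question's loops and records which loop saw each item. The run function and the lambdas are stand-ins for the question's should_inner_loop, stop_inner_loop, foo and bar, and it assumes the else belongs to the if.

```python
def run(items, should_inner_loop, stop_inner_loop):
    """Replay the question's nested loops, logging which loop handled each item."""
    a_generator = (item for item in items)
    log = []
    for outer_item in a_generator:
        if should_inner_loop(outer_item):
            for inner_item in a_generator:          # same iterator as the outer loop
                log.append(("inner", inner_item))   # stands in for foo(inner_item)
                if stop_inner_loop(inner_item):
                    break
        else:
            log.append(("outer", outer_item))       # stands in for bar(outer_item)
    return log

# The inner loop starts at item 2 and stops at item 5.
log = run(range(8), lambda x: x == 2, lambda x: x == 5)

# If the inner loop never breaks, it drains the generator and the outer loop ends too.
exhausted_log = run(range(4), lambda x: x == 1, lambda x: False)
```

In the first run, items 3 to 5 are seen only by the inner loop and the outer loop resumes at 6. The triggering item 2 reaches neither foo nor bar.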
If that is not what you want to do, consider itertools.tee(), which buffers the output of an iterator, allowing you to iterate over its output twice (or more). This is only efficient if the tee iterators stay close to each other in the output stream; if one tee iterator will be fully exhausted before the other is used, it's better to just call list on the iterator to materialize a list out of it.
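A short sketch of both options, using a made-up squares generator:

```python
import itertools

def squares(limit):
    """Hypothetical generator yielding 0, 1, 4, ... for n below limit."""
    for n in range(limit):
        yield n * n

# Option 1: tee gives two independent iterators over the same stream.
# Once tee has been called, the original generator should not be used directly.
first, second = itertools.tee(squares(4))
pairs = list(zip(first, second))   # both advance together, so little is buffered

# Option 2: materialize a list when one pass would finish before the other starts.
materialized = list(squares(4))
nested = [(a, b) for a in materialized[:2] for b in materialized[:2]]
```

With the list, each nested loop gets its own fresh iterator, so the inner loop no longer steals items from the outer one.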