Let it be an iterable object in Python. In which cases is a change to it inside a loop over it reflected in the loop? Or, more straightforwardly: when does something like this work?
it = range(6)
for i in it:
    it.remove(i+1)
    print i
This prints 0, 2, 4 (showing that the loop runs 3 times).
On the other hand,
it = range(6)
for i in it:
    it = it[:-2]
    print it
leads to the output:
[0, 1, 2, 3]
[0, 1]
[]
[]
[]
[]
which shows that the loop runs 6 times. I guess it has something to do with in-place operations or variable scope, but I cannot fully wrap my head around it.
Clarification:
One example that doesn't work:
it = range(6)
for i in it:
    it = it.remove(i+1)
    print it
This prints None and then raises an error (NoneType has no attribute 'remove'), because list.remove changes the list in place and returns None.
Unlike if statements, the condition in a while loop must eventually become False. If this doesn't happen, the while loop will keep going forever! The best way to make the condition change from True to False is to use a variable as part of the Boolean expression. We can then change the variable inside the while loop.
If you put i = 4, you change i only within the current iteration step. After the second iteration the loop goes on as expected with 3. If you want a different behaviour, don't use range; use a while loop instead. If you are using a for loop, you probably shouldn't change the index in multiple places like that.
In short: you cannot directly modify the i of a for i in ... loop, because of how this code works under the covers in Python. So a while loop should be used when you need to change your counter inside the loop (in Python).
To avoid this problem, a simple solution is to iterate over a copy of the list. For example, you obtain a copy of list_1 by using the slice notation with default values, list_1[:]. Because you iterate over a copy of the list, you can modify the original list without disturbing the iterator.
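A minimal sketch of the copy approach, written for Python 3 (where range must be wrapped in list() to get a mutable list). The function name remove_successors and the membership check are assumptions added for illustration, not part of the original answer:

```python
def remove_successors(values):
    """Remove i + 1 from the list for each i, looping over a snapshot."""
    items = list(values)
    visited = []
    for i in items[:]:            # iterate over a shallow copy of the list
        visited.append(i)
        if i + 1 in items:        # guard: the successor may not be present
            items.remove(i + 1)   # mutates the original, not the copy
    return visited, items


visited, remaining = remove_successors(range(6))
print(visited)    # every element of the snapshot is visited
print(remaining)  # what is left of the original list
```

Because the loop walks the snapshot, all six elements are visited even though the original list shrinks on every step.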
When you iterate over a list you actually call list.__iter__(), which returns a listiterator object bound to the list, and then actually iterate over this listiterator. Technically, this:
itt = [1, 2, 3]
for i in itt:
    print i
is actually kind of syntactic sugar for:
itt = [1, 2, 3]
# the iterator is created once, before the loop starts
iterator = iter(itt)
while True:
    try:
        i = next(iterator)
    except StopIteration:
        break
    print i
So at this point, within the loop, rebinding itt doesn't impact the listiterator (which keeps its own reference to the list), but mutating itt will obviously impact it (since both references point to the same list).
IOW it's the same old difference between rebinding and mutating... You'd get the same behaviour without the for loop:
# creates a `list` and binds it to name "a"
a = [1, 2, 3]
# get the object bound to name "a" and binds it to name "b" too.
# at this point "a" and "b" both refer to the same `list` instance
b = a
print id(a), id(b)
print a is b
# so if we mutate "a" - actually "mutate the object bound to name 'a'" -
# we can see the effect using any name referring to this object:
a.append(42)
print b
# now we rebind "a" - make it refer to another object
a = ["a", "b", "c"]
# at this point, "b" still refers to the first list, and
# "a" refers to the new ["a", "b", "c"] list
print id(a), id(b)
print a is b
# and of course if we now mutate "a", it won't reflect on "b"
a.pop()
print a
print b
In the first loop you are changing the it object itself (the inner state of the object). In the second loop, however, you are reassigning the name it to another object, leaving the initial object unchanged.
Let's take a look at the generated bytecode:
In [2]: def f1():
   ...:     it = range(6)
   ...:     for i in it:
   ...:         it.remove(i + 1)
   ...:         print i
   ...:
In [3]: def f2():
   ...:     it = range(6)
   ...:     for i in it:
   ...:         it = it[:-2]
   ...:         print it
   ...:
In [4]: import dis
In [5]: dis.dis(f1)
  2           0 LOAD_GLOBAL              0 (range)
              3 LOAD_CONST               1 (6)
              6 CALL_FUNCTION            1
              9 STORE_FAST               0 (it)
  3          12 SETUP_LOOP              36 (to 51)
             15 LOAD_FAST                0 (it)
             18 GET_ITER
        >>   19 FOR_ITER                28 (to 50)
             22 STORE_FAST               1 (i)
  4          25 LOAD_FAST                0 (it)
             28 LOAD_ATTR                1 (remove)
             31 LOAD_FAST                1 (i)
             34 LOAD_CONST               2 (1)
             37 BINARY_ADD
             38 CALL_FUNCTION            1
             41 POP_TOP
  5          42 LOAD_FAST                1 (i)
             45 PRINT_ITEM
             46 PRINT_NEWLINE
             47 JUMP_ABSOLUTE           19
        >>   50 POP_BLOCK
        >>   51 LOAD_CONST               0 (None)
             54 RETURN_VALUE
In [6]: dis.dis(f2)
  2           0 LOAD_GLOBAL              0 (range)
              3 LOAD_CONST               1 (6)
              6 CALL_FUNCTION            1
              9 STORE_FAST               0 (it)
  3          12 SETUP_LOOP              29 (to 44)
             15 LOAD_FAST                0 (it)
             18 GET_ITER
        >>   19 FOR_ITER                21 (to 43)
             22 STORE_FAST               1 (i)
  4          25 LOAD_FAST                0 (it)
             28 LOAD_CONST               2 (-2)
             31 SLICE+2
             32 STORE_FAST               0 (it)
  5          35 LOAD_FAST                0 (it)
             38 PRINT_ITEM
             39 PRINT_NEWLINE
             40 JUMP_ABSOLUTE           19
        >>   43 POP_BLOCK
        >>   44 LOAD_CONST               0 (None)
As you can see, the for statement works with an iterator obtained from it once, before the loop body runs (the GET_ITER instruction, i.e. iter(it)). Therefore, reassigning the it variable (STORE_FAST inside the loop) will not affect the loop iteration.
First, it is essential to understand what happens under the hood when you run a simple for-loop, like:
for i in it: pass
At the beginning of the loop, an iterator is created. That iterator is the result of an implicit call to iter(it). This is the only time the variable named it is referenced in the above loop. The rest of the references happen when next is called on that iterator, but it uses the object the iterator keeps a reference to, not the object the name it is bound to.
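To make this concrete, here is a small sketch that drives the iterator by hand instead of using a for loop. It assumes Python 3 (hence list(range(6))), and the names rebinding_vs_mutating and original are hypothetical, introduced only for this illustration:

```python
def rebinding_vs_mutating():
    it = list(range(6))
    iterator = iter(it)   # the iterator keeps its own reference to this list
    original = it         # second name for the same list object

    it = it[:-2]          # rebinding: "it" now names a new, shorter list
    first_two = [next(iterator), next(iterator)]  # still reads the original

    original.remove(2)    # mutating the list the iterator is bound to
    rest = list(iterator) # the iterator sees the mutation

    return it, first_two, rest


print(rebinding_vs_mutating())
```

Rebinding it leaves the iterator untouched, while removing an element from the list the iterator refers to changes what the remaining next calls return.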
What does this mean for your second example?
Note that in your second example, you do not change the list in place, but create a new list and bind the variable it to it.
It means the iterator keeps referencing the original list, which is unchanged.
In your first example, you change the original list in place, therefore calls to next(iterator) reflect those changes.
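For readers on Python 3, the two loops from the question can be reproduced roughly as follows. This is a sketch under the assumption that range(6) is wrapped in list() (a Python 3 range object has no remove method); the function names and the printed lists collecting the output are additions for illustration:

```python
def mutate_in_place():
    it = list(range(6))
    printed = []
    for i in it:
        it.remove(i + 1)    # mutates the list the loop's iterator reads from
        printed.append(i)
    return printed


def rebind_name():
    it = list(range(6))
    printed = []
    for i in it:
        it = it[:-2]        # builds a new list; the iterator is unaffected
        printed.append(it)
    return printed


print(mutate_in_place())  # the loop body runs 3 times
print(rebind_name())      # the loop body runs 6 times
```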