I'm trying to learn Python, and I started to play with some code:
a = [3, 4, 5, 6, 7]
for b in a:
    print(a)
    a.pop(0)
And the output is:
[3, 4, 5, 6, 7]
[4, 5, 6, 7]
[5, 6, 7]
I know it's not good practice to change a data structure while I'm looping over it, but I want to understand how Python manages the iterators in this case.
The main question is: how does it know that it has to finish the loop if I'm changing the state of a?
To carry out the iteration that a for loop describes, Python does the following: it calls iter() to obtain an iterator for the iterable (a in the question), calls next() repeatedly to obtain each item from the iterator in turn, and terminates the loop when next() raises the StopIteration exception.
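The three steps above can be written out by hand. This is only a rough sketch of what the for loop in the question amounts to, not the interpreter's actual code; the names iterator and seen are introduced here purely for illustration.

```python
a = [3, 4, 5, 6, 7]
seen = []  # hypothetical helper: records a snapshot of a on each pass

iterator = iter(a)          # step 1: ask the list for an iterator
while True:
    try:
        b = next(iterator)  # step 2: fetch the next item
    except StopIteration:   # step 3: the iterator reports it is done
        break
    print(a)
    seen.append(list(a))
    a.pop(0)
```

Run as written, this prints the same three lists as the original loop, which suggests the early exit comes from the iterator raising StopIteration sooner than expected rather than from anything special in the for statement.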
for loops are used when you have a block of code which you want to repeat a fixed number of times. The for-loop is always used in combination with an iterable object, like a list or a range. The Python for statement iterates over the members of a sequence in order, executing the block each time.
In Python, loops are used to execute a block of code repeatedly. Control statements change how a loop executes from its usual behavior, steering the flow of execution based on a condition.
The 'foreach' loop works with arrays only, with the advantage that a loop counter wouldn't need to be initialized. In addition to this, no condition needs to be set that would be needed to exit out of the loop. The 'foreach' loop implicitly does this too.
kjaquier and Felix have talked about the iterator protocol, and we can see it in action in your case:
>>> L = [1, 2, 3]
>>> iterator = iter(L)
>>> iterator
<list_iterator object at 0x101231f28>
>>> next(iterator)
1
>>> L.pop()
3
>>> L
[1, 2]
>>> next(iterator)
2
>>> next(iterator)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
From this we can infer that list_iterator.__next__ has code that behaves something like:
if self.i < len(self.list):
    item = self.list[self.i]
    self.i += 1
    return item
raise StopIteration
It does not naively get the item. That would raise an IndexError, which would bubble to the top:
class FakeList(object):
    def __iter__(self):
        return self

    def __next__(self):
        raise IndexError

for i in FakeList():  # Raises `IndexError` immediately with a traceback and all
    print(i)
Indeed, looking at listiter_next in the CPython source (thanks Brian Rodriguez):
if (it->it_index < PyList_GET_SIZE(seq)) {
    item = PyList_GET_ITEM(seq, it->it_index);
    ++it->it_index;
    Py_INCREF(item);
    return item;
}
Py_DECREF(seq);
it->it_seq = NULL;
return NULL;
Although I don't know how return NULL; eventually translates into a StopIteration.
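To make the C snippet easier to follow, here is a pure-Python imitation of it. This is a sketch under my own naming (ListIteratorSketch, seq, index), not CPython's real implementation; it only mirrors the bounds check, the index increment, and the dropping of the list reference once the iterator is exhausted.

```python
class ListIteratorSketch:
    """Pure-Python imitation of CPython's list iterator (illustrative only)."""

    def __init__(self, seq):
        self.seq = seq    # plays the role of it->it_seq
        self.index = 0    # plays the role of it->it_index

    def __iter__(self):
        return self

    def __next__(self):
        # The length is re-read on every call, so shrinking the list
        # makes the iterator finish earlier.
        if self.seq is not None and self.index < len(self.seq):
            item = self.seq[self.index]
            self.index += 1
            return item
        self.seq = None   # like it->it_seq = NULL: stay exhausted from now on
        raise StopIteration


L = [1, 2, 3]
it = ListIteratorSketch(L)
first = next(it)          # item at index 0
L.pop()                   # L is now [1, 2]
second = next(it)         # item at index 1
rest = list(it)           # index 2 is no longer < len(L), so nothing is left
L.append(99)
after_growth = list(it)   # the sketch already dropped its reference to L
```

As far as I know, a NULL return from the C function with no exception set is how CPython signals "exhausted", and the built-in next() then raises StopIteration on the caller's behalf, which is what the explicit raise models here.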
The reason you shouldn't modify a list while iterating over it is precisely so that you don't have to rely on how the iteration is implemented.
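If the goal is simply to consume the list from the front, there are ways to do it that do not depend on the iterator's internals. The two options below are a small sketch, and the variable names (visited, drained, c) are only illustrative.

```python
# Option 1: loop over a shallow copy and mutate the original freely.
a = [3, 4, 5, 6, 7]
visited = []
for b in a[:]:      # a[:] is a separate list, so its length never changes
    visited.append(b)
    a.pop(0)

# Option 2: let the list itself be the loop condition.
c = [3, 4, 5, 6, 7]
drained = []
while c:            # an empty list is falsy, so this stops when c is empty
    drained.append(c.pop(0))
```

Both versions visit every element exactly once, whatever the list iterator does internally.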
But back to the question. Lists in Python are array lists: they occupy a contiguous chunk of allocated memory, as opposed to linked lists, in which each element is allocated independently. Thus Python's lists, like arrays in C, are optimized for random access. In other words, the most efficient way to get from element n to element n+1 is to access element n+1 directly (by calling mylist.__getitem__(n+1) or mylist[n+1]).
So the implementation of __next__ (the method called on each iteration) for lists is just what you would expect: the index of the current element starts at 0 and is increased after each iteration.
In your code, if you also print b, you will see that happening:
a = [3, 4, 5, 6, 7]
for b in a:
    print(a, b)
    a.pop(0)
Result:
[3, 4, 5, 6, 7] 3
[4, 5, 6, 7] 5
[5, 6, 7] 7
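Because:
On the first iteration the index is 0, and a[0] == 3.
On the second iteration the index is 1, and a[1] == 5.
On the third iteration the index is 2, and a[2] == 7.
When the index reaches 3 the loop ends, because by then len(a) == 2 (so len(a) < 3).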
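The same bookkeeping can be reproduced with an explicit counter. In this sketch, index is a hypothetical stand-in for the iterator's internal position and trace is a helper list; neither name comes from Python itself.

```python
a = [3, 4, 5, 6, 7]
index = 0    # hypothetical stand-in for the iterator's internal counter
trace = []   # (index used, value of b, len(a) at that moment)

while index < len(a):       # the same kind of check the list iterator makes
    b = a[index]
    trace.append((index, b, len(a)))
    index += 1
    a.pop(0)                # shifts every remaining element one slot left
```

Each pop(0) moves the remaining elements down by one while the counter moves up by one, so the loop effectively skips every other value and stops as soon as the counter catches up with the shrinking length.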