
Why can I use the same name for iterator and sequence in a Python for loop?

This is more of a conceptual question. I recently saw a piece of code in Python (it worked in 2.7, and it may have been run in 2.5 as well) in which a for loop used the same name for both the list being iterated over and the item in the list, which strikes me as both bad practice and something that should not work at all.

For example:

x = [1,2,3,4,5]
for x in x:
    print x
print x

Yields:

1
2
3
4
5
5

Now, it makes sense to me that the last value printed would be the last value assigned to x from the loop, but I fail to understand why you'd be able to use the same variable name for both your parts of the for loop and have it function as intended. Are they in different scopes? What's going on under the hood that allows something like this to work?

Gustav asked Jul 11 '14 04:07




6 Answers

What does dis tell us:

Python 3.4.1 (default, May 19 2014, 13:10:29)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from dis import dis
>>> dis("""x = [1,2,3,4,5]
... for x in x:
...     print(x)
... print(x)""")

  1           0 LOAD_CONST               0 (1)
              3 LOAD_CONST               1 (2)
              6 LOAD_CONST               2 (3)
              9 LOAD_CONST               3 (4)
             12 LOAD_CONST               4 (5)
             15 BUILD_LIST               5
             18 STORE_NAME               0 (x)

  2          21 SETUP_LOOP              24 (to 48)
             24 LOAD_NAME                0 (x)
             27 GET_ITER
        >>   28 FOR_ITER                16 (to 47)
             31 STORE_NAME               0 (x)

  3          34 LOAD_NAME                1 (print)
             37 LOAD_NAME                0 (x)
             40 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             43 POP_TOP
             44 JUMP_ABSOLUTE           28
        >>   47 POP_BLOCK

  4     >>   48 LOAD_NAME                1 (print)
             51 LOAD_NAME                0 (x)
             54 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             57 POP_TOP
             58 LOAD_CONST               5 (None)
             61 RETURN_VALUE

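The key bits are the sections for source lines 2 and 3: we load the value out of x (24 LOAD_NAME 0 (x)), then we get its iterator (27 GET_ITER) and start iterating over it (28 FOR_ITER). The jump at the end of the loop body (44 JUMP_ABSOLUTE 28) goes back to FOR_ITER, not to LOAD_NAME, so Python never looks up the name x again to find the iterator.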

Aside: It wouldn't make any sense to do so, since it already has the iterator, and as Abhijit points out in his answer, Section 7.3 of Python's specification actually requires this behavior.

When the name x gets overwritten to point at each value inside the list formerly known as x, Python doesn't have any problem finding the iterator, because it never needs to look at the name x again to finish the iteration protocol.
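The same point can be sketched in plain Python 3 without a for loop. The name `it` is introduced here purely for illustration and does not appear in the bytecode above; the sketch only assumes that `iter()` does roughly what GET_ITER does.

```python
x = [1, 2, 3, 4, 5]
it = iter(x)          # roughly what LOAD_NAME + GET_ITER do before the loop starts
x = "rebound"         # the name x now points at something else entirely
remaining = list(it)  # the iterator still walks the original list object
print(remaining)
print(x)
```

The iterator holds its own reference to the list, so rebinding the name x has no effect on what it yields.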

Sean Vieira answered Oct 17 '22 14:10


Using your example code as the core reference

x = [1,2,3,4,5]
for x in x:
    print x
print x

I would like to refer you to section 7.3, "The for statement", in the manual.

Excerpt 1

The expression list is evaluated once; it should yield an iterable object. An iterator is created for the result of the expression_list.

What this means is that your variable x, a symbolic name for the list object [1,2,3,4,5], is evaluated to an iterable object. Even if the variable, the symbolic reference, later changes its allegiance, the expression list is not evaluated again, so there is no impact on the iterable object that has already been evaluated.

Note

  • Everything in Python is an object and has an identity, attributes and methods.
  • A variable is a symbolic name, a reference to one and only one object at any given instant.
  • A variable can change its allegiance at run time, i.e. it can be made to refer to some other object.
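A minimal Python 3 sketch of the points in this note, using two hypothetical names `a` and `b` that are not part of the original question:

```python
a = [1, 2, 3, 4, 5]
b = a                 # a second name for the same list object
same_object = b is a  # both names refer to one object
a = 42                # a changes its allegiance; the list itself is untouched
print(same_object, a, b)
```

Rebinding `a` only changes which object that name refers to. The list remains reachable through `b`, much as it remains reachable through the for loop's iterator.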

Excerpt 2

The suite is then executed once for each item provided by the iterator, in the order of ascending indices.

Here the suite is driven by the iterator and not by the expression list. So, for each iteration, the iterator is asked for the next item instead of the original expression list being consulted again.

Abhijit answered Oct 17 '22 12:10


It is necessary for it to work this way, if you think about it. The expression for the sequence of a for loop could be anything:

binaryfile = open("file", "rb")
for byte in binaryfile.read(5):
    ...

We can't query the sequence on each pass through the loop, or here we'd end up reading from the next batch of 5 bytes the second time. Naturally Python must in some way store the result of the expression privately before the loop begins.
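To check this without a real file, here is a Python 3 sketch that substitutes an in-memory `io.BytesIO` for the file (an assumption made only so the example is self-contained). Note that iterating over `bytes` in Python 3 yields integers rather than one-character strings.

```python
import io

binaryfile = io.BytesIO(b"abcdefghij")  # stand-in for open("file", "rb")
seen = []
for byte in binaryfile.read(5):  # read(5) is called exactly once
    seen.append(byte)
position = binaryfile.tell()     # how far the stream has been consumed
print(seen, position)
```

If the expression were re-evaluated on every pass, the stream position would have moved past byte 5.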


Are they in different scopes?

No. To confirm this you could keep a reference to the original scope dictionary (locals()) and notice that you are in fact using the same variables inside the loop:

x = [1,2,3,4,5]
loc = locals()
for x in x:
    print locals() is loc  # True
    print loc["x"]  # 1
    break
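Another way to see that the loop introduces no new scope, sketched in Python 3 with a hypothetical helper `last_item`: the loop variable is still bound, in the same scope, after the loop finishes.

```python
def last_item(x):
    # x starts out as the sequence and is rebound to each item in turn
    for x in x:
        pass
    return x  # still visible here: the loop did not create its own scope

result = last_item([1, 2, 3, 4, 5])
print(result)
```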

What's going on under the hood that allows something like this to work?

Sean Vieira showed exactly what is going on under the hood, but to describe it in more readable python code, your for loop is essentially equivalent to this while loop:

it = iter(x)
while True:
    try:
        x = it.next()  # Python 2 spelling; next(it) works on 2.6+ and Python 3
    except StopIteration:
        break
    print x

This is different from the traditional indexing approach to iteration you would see in older versions of Java, for example:

for (int index = 0; index < x.length; index++) {
    x = x[index];
    ...
}

This approach would fail when the item variable and the sequence variable are the same, because the sequence x would no longer be available to look up the next index after the first time x was reassigned to the first item.
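For comparison, here is a Python 3 sketch of that index-based style. It is an illustration only, not how the for statement is implemented. It breaks on the second pass because the name x no longer refers to the list.

```python
x = [1, 2, 3]
error = None
try:
    for index in range(len(x)):
        x = x[index]  # first pass: x becomes 1, so the list is unreachable via x
except TypeError as exc:
    error = exc       # second pass: an int cannot be indexed
print(x, type(error).__name__)
```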

With the former approach, however, the first line (it = iter(x)) requests an iterator object which is what is actually responsible for providing the next item from then on. The sequence that x originally pointed to no longer needs to be accessed directly.
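The while-loop equivalence can be checked in Python 3, where `next(it)` replaces `it.next()`. The wrapper functions `run_for` and `run_while` are hypothetical names used only to compare the two forms side by side.

```python
def run_for(x):
    seen = []
    for x in x:
        seen.append(x)
    return seen, x

def run_while(x):
    seen = []
    it = iter(x)           # the sequence is looked up once, here
    while True:
        try:
            x = next(it)   # rebinding x does not disturb the iterator
        except StopIteration:
            break
        seen.append(x)
    return seen, x

for_result = run_for([1, 2, 3, 4, 5])
while_result = run_while([1, 2, 3, 4, 5])
print(for_result == while_result)
```

Both functions visit the same items and leave x bound to the last one.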

nmclean answered Oct 17 '22 14:10


It's the difference between a variable (x) and the object it points to (the list). When the for loop starts, Python grabs an internal reference to the object pointed to by x. It uses the object and not what x happens to reference at any given time.

If you reassign x, the for loop doesn't change. If x points to a mutable object (e.g., a list) and you change that object (e.g., delete an element), results can be unpredictable.
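A small Python 3 sketch of the mutation case, using hypothetical names `items` and `visited`. The result shown reflects how CPython's list iterator tracks a position index; treat it as an illustration of why this is best avoided, not as behavior to rely on.

```python
items = [1, 2, 3, 4, 5]
visited = []
for item in items:
    visited.append(item)
    if item == 2:
        items.remove(item)  # shifts later elements left under the iterator
print(visited)
print(items)
```

Removing 2 moves 3 into the slot the iterator has already passed, so 3 is never visited.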

tdelaney answered Oct 17 '22 12:10


Basically, the for loop takes in the list x and, storing it as a temporary variable, reassigns x to each value in that temporary variable. Thus, x is now the last value in the list. (The session below is Python 2, where a list comprehension rebinds x the same way a for loop does.)

>>> x = [1, 2, 3]
>>> [x for x in x]
[1, 2, 3]
>>> x
3
>>> 

Just like in this:

>>> def foo(bar):
...     return bar
... 
>>> x = [1, 2, 3]
>>> for x in foo(x):
...     print x
... 
1
2
3
>>> 

In this example, the list is passed to foo() as bar and handed straight back, and the for loop keeps its own reference to the returned object. So although x is being reassigned, the list still exists and can drive our for loop.
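One caveat about the list comprehension shown earlier: that session is Python 2. In Python 3 a comprehension's loop variable is local to the comprehension, so the outer x survives it, while a plain for loop still rebinds x. A short sketch (the names `doubled`, `after_comprehension` and `after_loop` are illustrative only):

```python
x = [1, 2, 3]
doubled = [x * 2 for x in x]  # inner x is private to the comprehension
after_comprehension = x       # outer x is still the list in Python 3

for x in x:                   # a for statement rebinds the outer name
    pass
after_loop = x
print(doubled, after_comprehension, after_loop)
```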

ZenOfPython answered Oct 17 '22 14:10


x no longer refers to the original x list, and so there's no confusion. Basically, Python remembers it's iterating over the original x list, but as soon as you start assigning the iteration value (0, 1, 2, etc.) to the name x, that name no longer refers to the original list. The name gets reassigned to the iteration value.

In [1]: x = range(5)

In [2]: x
Out[2]: [0, 1, 2, 3, 4]

In [3]: id(x)
Out[3]: 4371091680

In [4]: for x in x:
   ...:     print id(x), x
   ...:     
140470424504688 0
140470424504664 1
140470424504640 2
140470424504616 3
140470424504592 4

In [5]: id(x)
Out[5]: 140470424504592
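The session above is Python 2, where range(5) returns a list. A rough Python 3 counterpart is sketched below. It keeps a second, hypothetical name `original` so the object can still be inspected after the loop, and it compares identity with `is` instead of printing `id()` values, which vary from run to run.

```python
x = range(5)          # a lazy range object in Python 3, not a list
original = x          # keep a handle on the object being iterated
for x in x:
    pass
rebound = x is not original  # x now names the last item, not the range
print(x, list(original), rebound)
```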

Noah answered Oct 17 '22 12:10