Understanding Python's name binding

I am trying to clarify for myself Python's rules for 'assigning' values to variables.

Is the following comparison between Python and C++ valid?

  1. In C/C++, the statement int a=7 means that memory is allocated for an integer variable called a (the quantity on the LEFT of the = sign), and only then is the value 7 stored in it.

  2. In Python, the statement a=7 means that a nameless integer object with value 7 (the quantity on the RIGHT side of the =) is created first and stored somewhere in memory. Then the name a is bound to this object.

The output of the following C++ and Python programs seems to bear this out, but I would like some feedback on whether I am right.

C++ produces different memory locations for a and b, while a and b seem to refer to the same location in Python (going by the output of the id() function).

C++ code

#include<iostream>
using namespace std;
int main(void)
{
  int a = 7;
  int b = a; 
  cout << &a <<  "  " << &b << endl; // a and b point to different locations in memory
  return 0;
}

Output: 0x7ffff843ecb8 0x7ffff843ecbc

Python code

a = 7
b = a
print(id(a), id(b))  # a and b seem to refer to the same location

Output: 23093448 23093448

smilingbuddha asked Jan 30 '15 02:01


1 Answer

Yes, you're basically correct. In Python, a variable name can be thought of as a binding to a value. This is one of those "aha" moments people tend to experience when they truly start to grok (deeply understand) Python.

Assigning to a variable name in Python re-binds the name to a different value (or binds it for the first time, if it was not already bound), rather than changing the value it was previously bound to:

a = 7   # Create 7, bind a to it.
        #     a -> 7

b = a   # Bind b to the thing a is currently bound to.
        #     a
        #      \
        #       *-> 7
        #      /
        #     b

a = 42  # Create 42, bind a to it, b still bound to 7.
        #     a -> 42
        #     b -> 7

I say "create", but that's not necessarily so: if an object with that value already exists somewhere, the implementation may re-use it.
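One way to check which names share an object is the is operator (or comparing id() values). The following is only a small sketch of the diagram above; the name c is made up for illustration, and the last check relies on CPython caching small integers, which is an implementation detail rather than a language guarantee:

```python
a = 7
b = a            # b is bound to the same object as a
print(a is b)    # True: both names refer to one object

a = 42           # re-binds a only; b is untouched
print(a is b)    # False
print(b)         # 7

# Assumption: CPython is the interpreter in use. It caches small integers,
# so a 7 built at runtime is typically the very same object.
c = int("7")
print(b is c)    # True on CPython; other implementations may differ
```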

Where the underlying data is immutable (cannot be changed), that usually makes Python look as if it's behaving identically to the way other languages do (C and C++ come to mind). That's because the 7 (the actual object that the names are bound to) cannot be changed.
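As a rough illustration of why immutable values hide the sharing, here is a sketch using an int and a str (the names s and t are just for the example). Every operation that appears to "change" the value really builds a new object and re-binds the name, so the other name never sees a difference:

```python
a = 7
b = a
a += 1            # ints are immutable: this builds 8 and re-binds a
print(a, b)       # 8 7

s = "abc"
t = s
s = s.upper()     # str methods return a new string, the original is untouched
print(s, t)       # ABC abc
```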

But with mutable data, people are sometimes surprised because they don't realise that the value behind the names is shared (much as it would be with pointers in C or references in C++):

>>> a = [1,2,3]     # a -> [1,2,3]
>>> print(a)
[1, 2, 3]

>>> b = a           # a,b -> [1,2,3]
>>> print(b)
[1, 2, 3]

>>> a[1] = 42       # a,b -> [1,42,3]
>>> print(a) ; print(b)
[1, 42, 3]
[1, 42, 3]

You need to understand that a[1] = 42 is different to a = [1, 42, 3]. The latter assigns to the name a itself, which re-binds a to a different object and therefore makes it independent of b.

The former simply changes, in place, the mutable object that both a and b are bound to, which is why it affects both.

There are ways to get an independent copy of a mutable value such as a list, including:
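To make the difference concrete, here is a short sketch that does both in sequence and uses is to show when the two names stop sharing an object:

```python
a = [1, 2, 3]
b = a                 # a and b are bound to the same list

a[1] = 42             # mutates the shared list in place
print(b)              # [1, 42, 3]
print(a is b)         # True

a = [1, 42, 3]        # re-binds a to a brand-new list with equal contents
a[0] = 99             # only the new list changes
print(a)              # [99, 42, 3]
print(b)              # [1, 42, 3]
print(a is b)         # False
```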

b = a[:]
b = [item for item in a]
b = list(a)
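A quick way to convince yourself that each of those gives a separate list is to compare with == (same contents) and is (same object). This is just a sketch; the copies dictionary and its labels are made up for the example:

```python
a = [1, 2, 3]

# The three copying idioms from above, keyed by a descriptive label.
copies = {
    "slice": a[:],
    "comprehension": [item for item in a],
    "list()": list(a),
}

for label, b in copies.items():
    # Equal contents, but a distinct list object each time.
    print(label, b == a, b is a)

a[0] = 99                 # mutate the original
print(copies["slice"])    # [1, 2, 3]: the copy is unaffected
```

All three copies print True for equality and False for identity.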

These copies work to one level only (b = a can be thought of as working to zero levels), meaning that if the a list contains other mutable things, those will still be shared between a and b:

>>> a = [1, [2, 3, 4], 5]
>>> b = a[:]
>>> a[0] = 8             # This is independent.
>>> a[1][1] = 9          # This is still shared.
>>> print(a) ; print(b)  # Shared bit will 'leak' between a and b.
[8, [2, 9, 4], 5]
[1, [2, 9, 4], 5]

For a truly independent copy, you can use copy.deepcopy (from the standard copy module), which will work down to as many levels as needed to separate the two objects.
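As a sketch of the difference, the following repeats the nested-list example with both a one-level copy and a deep copy (the names shallow and deep are just for illustration):

```python
import copy

a = [1, [2, 3, 4], 5]
shallow = a[:]              # one level: the inner list is still shared
deep = copy.deepcopy(a)     # recursively copies the inner list too

a[1][1] = 9                 # mutate the inner list through a

print(a)                    # [1, [2, 9, 4], 5]
print(shallow)              # [1, [2, 9, 4], 5]  (change leaked in)
print(deep)                 # [1, [2, 3, 4], 5]  (fully independent)

print(shallow[1] is a[1])   # True
print(deep[1] is a[1])      # False
```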

paxdiablo answered Sep 29 '22 13:09