I am trying to clarify for myself Python's rules for 'assigning' values to variables.
Is the following comparison between Python and C++ valid?
In C/C++, the statement int a = 7 means that memory is allocated for an integer variable called a (the quantity on the LEFT of the = sign), and only then is the value 7 stored in it.
In Python, the statement a = 7 means that a nameless integer object with value 7 (the quantity on the RIGHT side of the =) is created first and stored somewhere in memory. Then the name a is bound to this object.
The output of the following C++ and Python programs seems to bear this out, but I would like some feedback on whether I am right.
C++ produces different memory locations for a and b, while a and b seem to refer to the same location in Python (going by the output of the id() function).
C++ code
#include <iostream>
using namespace std;
int main(void)
{
    int a = 7;
    int b = a;
    cout << &a << " " << &b << endl; // a and b point to different locations in memory
    return 0;
}
Output: 0x7ffff843ecb8 0x7ffff843ecbc
Python code
a = 7
b = a
print(id(a), id(b)) # a and b seem to refer to the same location
Output: 23093448 23093448
Yes, you're basically correct. In Python, a variable name can be thought of as a binding to a value. This is one of those "a ha" moments people tend to experience when they truly start to grok (deeply understand) Python.
Assigning to a variable name in Python makes the name bind to a different value from what it currently was bound to (if indeed it was already bound), rather than changing the value it currently binds to:
a = 7 # Create 7, bind a to it.
# a -> 7
b = a # Bind b to the thing a is currently bound to.
# a
# \
# *-> 7
# /
# b
a = 42 # Create 42, bind a to it, b still bound to 7.
# a -> 42
# b -> 7
I say "create" but that's not necessarily so - if a value already exists somewhere, it may be re-used.
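The rebinding shown above can be checked with the is operator, which tests whether two names are bound to the same object. This is only a minimal sketch: the names same_before and same_after are made up for illustration, and the example relies solely on b = a, since whether two separately written 7 literals share one object is an implementation detail.

```python
a = 7
b = a                  # b is bound to the same object that a is bound to
same_before = a is b   # one object, two names

a = 42                 # rebinds a only; the object 7 is left untouched
same_after = a is b    # a and b now name different objects

print(same_before, same_after, b)  # True False 7
```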
Where the underlying data is immutable (cannot be changed), that usually makes Python look as if it's behaving identically to the way other languages do (C and C++ come to mind). That's because the 7 (the actual object that the names are bound to) cannot be changed.
But, for mutable data (same as using pointers in C or references in C++), people can sometimes be surprised because they don't realise that the value behind it is shared:
>>> a = [1,2,3] # a -> [1,2,3]
>>> print(a)
[1, 2, 3]
>>> b = a # a,b -> [1,2,3]
>>> print(b)
[1, 2, 3]
>>> a[1] = 42 # a,b -> [1,42,3]
>>> print(a) ; print(b)
[1, 42, 3]
[1, 42, 3]
You need to understand that a[1] = 42 is different to a = [1, 42, 3]. The latter is an assignment, which would result in a being re-bound to a different object, and therefore independent of b.
The former is simply changing the mutable data that both a and b are bound to, which is why it affects both.
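The difference between mutating and rebinding can be seen side by side in a short sketch. The variable names shared, equal and identical are only illustrative.

```python
a = [1, 2, 3]
b = a

a[1] = 42            # mutation: changes the one list both names refer to
shared = a is b      # still the same object
print(b)             # [1, 42, 3]

a = [1, 42, 3]       # rebinding: a now names a brand-new list
equal = a == b       # same contents
identical = a is b   # but no longer the same object

a[0] = 0             # only the new list changes
print(a)             # [0, 42, 3]
print(b)             # [1, 42, 3]
```

Here == compares contents while is compares identity, which is why the two lists can be equal without being the same object.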
There are ways to get independent copies of a mutable value, with things such as:
b = a[:]
b = [item for item in a]
b = list(a)
These will work to one level (b = a can be thought of as working to zero levels), meaning that if the a list contains other mutable things, those will still be shared between a and b:
>>> a = [1, [2, 3, 4], 5]
>>> b = a[:]
>>> a[0] = 8 # This is independent.
>>> a[1][1] = 9 # This is still shared.
>>> print(a) ; print(b) # Shared bit will 'leak' between a and b.
[8, [2, 9, 4], 5]
[1, [2, 9, 4], 5]
For a truly independent copy, you can use copy.deepcopy from the standard library's copy module, which will work down to as many levels as needed to separate the two objects.
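As a rough illustration, the sketch below compares a one-level copy with a deep copy of the same nested list. The names shallow and deep are chosen just for this example.

```python
import copy

a = [1, [2, 3, 4], 5]
shallow = a[:]             # copies the outer list only
deep = copy.deepcopy(a)    # copies the outer list and everything inside it

a[1][1] = 9                # mutate the nested list through a

print(shallow)             # [1, [2, 9, 4], 5]  (inner list is shared)
print(deep)                # [1, [2, 3, 4], 5]  (fully independent)
print(shallow[1] is a[1], deep[1] is a[1])  # True False
```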