Numpy interfering with namespace

import numpy as np

def f(x):
    x /= 10

data = np.linspace(0, 1, 5)
print data
f(data)
print data

Output on my system (debian 8, Python 2.7.9-1, numpy 1:1.8.2-2)

[ 0.    0.25  0.5   0.75  1.  ]
[ 0.     0.025  0.05   0.075  0.1  ]

Normally I would expect data to stay untouched when passing it to a function as this has its own separate namespace. But when the data is a numpy array the function changes data globally.

Is this a feature, a bug or am I maybe missing something? How should I avoid this behavior when using a custom plot function to scale my data automatically?

UPDATE (See Kevin J. Chase's answer for more details)

import numpy as np

def f(x):
    print id(x)
    x = x/10
    print id(x)

data = np.linspace(0, 1, 5)
print id(data)
print data
f(data)
print data

Output on my system (debian 8, Python 2.7.9-1, numpy 1:1.8.2-2)

48844592
[ 0.    0.25  0.5   0.75  1.  ]
48844592
45972592
[ 0.    0.25  0.5   0.75  1.  ]

Using x = x/10 instead of x /= 10 solves the problem for me.

The behaviour of the short x /= 10 statement depends heavily on the type of x. It mutates x in place if the type implements the in-place operation, as a NumPy array does, and rebinds x to a new object otherwise, as for immutable types such as int and float.

It is not equivalent to x = x/10, which always rebinds.

A NumPy array is a mutable object.
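For the scaling use case, a minimal sketch of a function that leaves the caller's array alone could look like the following. It assumes Python 3 and NumPy; the name scaled and its factor parameter are hypothetical, chosen only for illustration.

```python
import numpy as np

def scaled(x, factor=10):
    # Hypothetical helper: x / factor builds a new array,
    # so nothing is ever written into the caller's x.
    return x / factor

data = np.linspace(0, 1, 5)
result = scaled(data)
print(data.max())    # 1.0: the caller's array is untouched
print(result.max())  # 0.1
```

If in-place operators are more convenient inside the function, starting with x = x.copy() should give the same protection, at the cost of one extra array.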

Felix asked Apr 30 '26 19:04


1 Answer

Normally I would expect data to stay untouched when passing it to a function as this has its own separate namespace.

x in the function and data at the module level are two names for the same object. Since that object is mutable, any changes made to it will be "seen" regardless of which name is used to refer to the object. Namespaces can't protect you from that.
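A small sketch of the "two names, one object" point, assuming Python 3 and NumPy. The seen_ids list is a hypothetical addition used only to carry the identity out of the function.

```python
import numpy as np

seen_ids = []

def f(x):
    # Record the identity of the object the local name x refers to.
    seen_ids.append(id(x))

data = np.linspace(0, 1, 5)
f(data)
print(seen_ids[0] == id(data))  # True: the call did not copy the array
```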

x /= 10 divides every element of the NumPy array by 10. The original data is gone after this line executes. If you were to run f(data) a few more times, you'd find the contents draw closer to 0.0 each time.
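The repeated-call effect can be sketched like this, assuming Python 3 and NumPy. The largest list is a hypothetical addition that tracks the biggest element after each call.

```python
import numpy as np

def f(x):
    x /= 10  # in-place: writes into the caller's array

data = np.linspace(0, 1, 5)
largest = []
for _ in range(3):
    f(data)
    largest.append(float(data.max()))

# Each call shrinks the maximum by another factor of 10.
print(largest)
```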

Lists are a more familiar example of the same effect:

l = list(range(4))
print(l)
# [0, 1, 2, 3]
l += [4]
print(l)
# [0, 1, 2, 3, 4]
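The same list behaviour shows up across a function call. The sketch below uses two hypothetical functions, rebind_four and append_four, to contrast + with +=:

```python
def rebind_four(items):
    # list.__add__ builds a new list; only the local name is rebound.
    items = items + [4]

def append_four(items):
    # list.__iadd__ extends the existing list in place.
    items += [4]

l = list(range(4))
rebind_four(l)
after_rebind = list(l)
append_four(l)
after_inplace = list(l)
print(after_rebind)   # [0, 1, 2, 3]
print(after_inplace)  # [0, 1, 2, 3, 4]
```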

For a good overview of this sort of thing (including related issues) I recommend Ned Batchelder's “Facts and Myths about Python Names and Values” (26 minute video from PyCon US 2015). His example of list "addition" starts about 10 minutes in.

Behind the Scenes

/ and /= (and similar pairs of operators) do different things. Tutorials often claim that these two operations are the same:

x = x / 10
x /= 10

...but they're not. Full details can be found in The Python Language Reference, 3.3.7. Emulating Numeric Types.

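/ calls the __truediv__ (or maybe __rtruediv__ --- a topic for another day) method on one of the two objects, feeding the other object as the argument. (On Python 2 without from __future__ import division, as in the question, the corresponding methods are __div__ and, for /=, __idiv__.)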

# x = x / 10
x = x.__truediv__(10)

Typically, these methods return some new value without altering the old one. This is why data was unchanged by x / 10, but id(x) changed --- x now referred to a new object, and was no longer an alias for data.
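A sketch of the rebinding case, assuming Python 3 and NumPy, with the method call written out by hand for illustration:

```python
import numpy as np

data = np.linspace(0, 1, 5)
x = data                 # x is an alias for data
x = x.__truediv__(10)    # what x = x / 10 does for an ndarray
print(x is data)         # False: x now names a new array
print(data.max())        # 1.0: data is unchanged
```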

/= calls a completely different method, __itruediv__ for the "in-place" operation:

# x /= 10
x = x.__itruediv__(10)

These methods typically modify the object in place and then return self. This explains why id(x) was unchanged and why data's contents had changed --- x and data were still two names for one and the same object. From the docs I linked above:

These methods should attempt to do the operation in-place (modifying self) and return the result (which could be, but does not have to be, self). If a specific method is not defined, the augmented assignment falls back to the normal methods [meaning __add__ and family --- KJC].
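Both branches of that rule can be sketched side by side, assuming Python 3 and NumPy. The variable names are hypothetical.

```python
import numpy as np

data = np.linspace(0, 1, 5)
x = data
x /= 10                      # ndarray defines __itruediv__, so this mutates
array_still_aliased = x is data

n = 50
m = n
m /= 10                      # int has no __itruediv__; falls back to __truediv__
int_has_inplace = hasattr(n, "__itruediv__")

print(array_still_aliased)   # True
print(int_has_inplace)       # False
print(n, m)                  # 50 5.0
```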

If you look at the methods of different data types, you'll find that they don't support all of these.

  • dir(0) shows that integers lack the in-place methods, which shouldn't be surprising, because they're immutable.

  • dir([]) reveals only two in-place methods: __iadd__ and __imul__ --- you can't divide or subtract from a list, but you can in-place add another list, and you can multiply it by an integer. (Again, those methods can do whatever they want with their arguments, including refuse them... list.__iadd__ won't take an integer, while list.__imul__ will reject a list.)

  • dir(np.linspace(0, 1, 5)) shows basically all of the arithmetic, logic, and bitwise methods, with normal and in-place for each. (It could be missing some --- I didn't count them all.)
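Rather than reading the dir() output by eye, the same checks can be sketched with hasattr. The INPLACE list and the inplace_methods helper are hypothetical names, and the list is my own enumeration of the in-place operator methods in Python 3.

```python
import numpy as np

# Hypothetical list of the in-place operator method names in Python 3.
INPLACE = ["__iadd__", "__isub__", "__imul__", "__imatmul__", "__itruediv__",
           "__ifloordiv__", "__imod__", "__ipow__", "__ilshift__",
           "__irshift__", "__iand__", "__ixor__", "__ior__"]

def inplace_methods(obj):
    # Return the in-place methods that obj's type actually provides.
    return [name for name in INPLACE if hasattr(obj, name)]

int_methods = inplace_methods(0)
list_methods = inplace_methods([])
array_methods = inplace_methods(np.linspace(0, 1, 5))

print(int_methods)    # []
print(list_methods)   # ['__iadd__', '__imul__']
print(array_methods)  # most or all of INPLACE, depending on the NumPy version
```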

Finally, to re-reiterate, what namespace these objects are in when their methods get called makes absolutely no difference. In Python, data has no scope... if you have a reference to it, you can call methods on it. (From Ned Batchelder's talk: Variables have a scope, but no type; data has a type, but no scope.)

Kevin J. Chase answered May 03 '26 10:05


