I noticed a difference between x = x + a and x += a when manipulating NumPy arrays in Python.
What I was trying to do was simply add some random errors to an integer array, like this:
import numpy
x=numpy.arange(12)
a=numpy.random.random(size=12)
x+=a
but printing out x gives an integer array: [0,1,2,3,4,5,6,7,8,9,10,11].
It turns out that if I use x = x + a instead, it works as expected.
Is that something we should be aware of? The two behave so differently. I used to think that x += a and x = x + a were totally equivalent, and I have been using them interchangeably without paying attention. Now I am concerned about all the computations I have done so far. Who knows when and where this has caused a problem; I will have to go through everything to double-check.
Is this a bug in numpy? I have tested in numpy version 1.2.0 and 1.6.1 and they both did this.
No, this is not a bug; this is intended behavior. += does an in-place addition, so it can't change the data type of the array x. When the dtype is integral, that means the floating-point temporaries resulting from adding in the elements of a get truncated to integers. Since np.random.random returns floats in the range [0, 1), the result is always truncated back to the original values in x.
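A minimal sketch of the truncation described above. One caveat: recent NumPy releases (1.10 onward, as far as I know) no longer perform this cast silently and raise a TypeError for x += a, so the sketch spells out the old behaviour with np.add and casting="unsafe". The seed and the use of default_rng are illustrative choices, not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for repeatability only
x = np.arange(12)                # integer dtype
a = rng.random(12)               # floats in [0, 1)

# What `x += a` did in old NumPy: compute the sum in float, then cast
# it back into x's integer buffer, dropping the fractional part.
np.add(x, a, out=x, casting="unsafe")

print(x.dtype.kind)   # 'i': still an integer array
print(x)              # same values as np.arange(12)

# Recent NumPy rejects the implicit float-to-int cast instead.
try:
    x += a
except TypeError as exc:
    print("in-place add refused:", exc)
```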
By contrast, x + a needs to allocate a new array anyway, and upcasts the dtype of that new array to float when one argument is float and the other is integral.
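The upcast can be checked directly. This is a small sketch; the name y is only for illustration.

```python
import numpy as np

x = np.arange(12)                 # integer dtype
a = np.random.random(size=12)     # float64

y = x + a                         # allocates a new array
print(x.dtype.kind, y.dtype)      # x stays integral; y is float64
print(np.result_type(x, a))       # dtype NumPy promotes to: float64
print(np.shares_memory(x, y))     # False: y has its own buffer
```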
The best way to avoid this problem is to be explicit about the required dtype in the arange call:
import numpy as np
x = np.arange(12, dtype=float)
x += np.random.random(size=12)
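If the integer array already exists and was not created by your own arange call, one option is to convert it with astype before the in-place addition. This is a sketch under that assumption; counts, noise and noisy are hypothetical names.

```python
import numpy as np

counts = np.array([3, 1, 4, 1, 5])          # hypothetical integer data
noise = np.random.random(size=counts.size)  # floats in [0, 1)

noisy = counts.astype(float)   # float copy; counts is left untouched
noisy += noise                 # in-place add is now float + float

print(noisy.dtype)             # float64
print(counts)                  # original integers unchanged
```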
(Note that x += a and x = x + a are not equivalent in Python when x is a mutable object, since the former typically modifies the object pointed to by x in place, while the latter creates a new object. E.g. with pure Python lists:
a = []
b = a
a += [1]
modifies b as well, while a = a + [1] would leave b untouched.)
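The list case can be run as is. The `is` checks show whether the two names still refer to the same object.

```python
a = []
b = a            # b is a second name for the same list
a += [1]         # list.__iadd__ extends the list in place
print(b)         # [1]
print(a is b)    # True

a = a + [1]      # builds a new list and rebinds the name a
print(a)         # [1, 1]
print(b)         # [1]
print(a is b)    # False
```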
x += a modifies x in-place: data will be cast to int on assignment. x = x + a will assign the result of x + a to the label x, and in this case x + a will be promoted to float.
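The "label" point also holds when no casting is involved. A small sketch with a float array and a second name for it (alias is a hypothetical name):

```python
import numpy as np

x = np.zeros(3)        # float array, so no dtype change is needed
alias = x              # second label for the same array

x += 1.0               # in place: the shared buffer changes
print(alias)           # [1. 1. 1.]

x = x + 1.0            # new array; the label x is rebound to it
print(x)               # [2. 2. 2.]
print(alias)           # [1. 1. 1.]
print(x is alias)      # False
```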