 

NumPy: strange difference in behavior between in-place and explicit operations

I want to operate on NumPy arrays so I can use their indexing, and I want to include the 0-dimensional case. I came across a strange situation where a type conversion appears if I don't use in-place multiplication:

In [1]: import numpy as np

In [2]: x = 1.*np.array(1.)

In [3]: y = np.array(1.)

In [4]: y *= 1.

In [5]: x
Out[5]: 1.0

In [6]: y
Out[6]: array(1.)

In [7]: type(x)
Out[7]: numpy.float64

In [8]: type(y)
Out[8]: numpy.ndarray

Why is the type of x different from the type of y? I know that in-place operations are implemented differently and don't create a copy of the array, but I don't see why the type changes when I multiply a 0-d array by a float. It works for 1-d arrays:

In [1]: import numpy as np

In [2]: x = np.array(1.)

In [3]: y = np.array([1.])

In [4]: 1.*x
Out[4]: 1.0

In [5]: 1.*y
Out[5]: array([1.])

In [7]: type(1.*x)
Out[7]: numpy.float64

In [8]: type(1.*y)
Out[8]: numpy.ndarray

I find that strange. It leads to the following problem, where I would have to treat 0-d arrays separately:

In [1]: import numpy as np

In [2]: x = np.array(1.)

In [3]: y = np.array(1.)*1.

In [4]: x[x>0]
Out[4]: array([1.])

In [5]: y[y>0]
Out[5]: array([1.])

In [6]: x[x>0] = 2.

In [7]: y[y>0] = 2.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-5f9c5b138fc0> in <module>()
----> 1 y[y>0] = 2.

TypeError: 'numpy.float64' object does not support item assignment
asked Aug 20 '18 by greeeeeeen
1 Answer

Ultimately this behavior comes down to free choices made by the developer(s), and so no good explanation need necessarily exist. However, I would like to defend/explain the observed behavior as follows.

In the case of

y = np.array(1.)
y *= 1.

we create a np.ndarray object y, and then do an operation on it. Here, the most natural behavior is for the operation to (maybe) alter the value of y, while the type should stay the same. This is indeed what actually happens.

As an aside, note the distinction between the type and the NumPy data type (or dtype). Had we started out with y = np.array(1) (dtype np.int64), the operation y *= 1. would be illegal, as it would need to change the dtype in place!
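
For instance (a minimal sketch of my own, assuming a reasonably recent NumPy; the exact exception text differs between versions):

import numpy as np

y = np.array(1)    # 0-d integer array (dtype int64 on most platforms)
try:
    y *= 1.        # would have to change the dtype from int to float in place
except TypeError as err:
    print("refused:", err)

y = np.array(1.)   # 0-d float array
y *= 1.            # allowed: the value may change, but dtype and type stay the same
print(type(y), y.dtype)   # <class 'numpy.ndarray'> float64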

For the case x = 1.*np.array(1.), let's write it out as

x1 = 1.
x2 = np.array(1.)
x = x1*x2

Here, we do not create an object and then operate on it. Instead we create two objects, x1 and x2, and then combine them into a third object, x, using a symmetric operation (here binary multiplication). As x1 and x2 happen to have different (but compatible) types, the type of x is non-obvious: it could equally well be the type of x1 (float) or the type of x2 (numpy.ndarray). Surprisingly, the actual answer is neither, as the type of x is np.float64, a NumPy scalar type. This behavior stems from two separate choices.

Choice 1

Combining a 0-dimensional array with a scalar results in a scalar, not a 0-dimensional array. This is really the choice that trips you up. I suppose it might as well have been chosen the other way around. A global switch (e.g. np.return_scalar = False) would be a nice feature to have!
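
To illustrate this choice (a small sketch of my own; np.multiply's out argument and np.asarray are just two of several ways to hold on to a 0-d array):

import numpy as np

x2 = np.array(1.)                          # 0-d array

print(type(1. * x2))                       # <class 'numpy.float64'>: a scalar comes back

out = np.empty(())                         # pre-allocated 0-d output array
print(type(np.multiply(x2, 1., out=out)))  # <class 'numpy.ndarray'>: out is returned as-is

print(type(np.asarray(1. * x2)))           # <class 'numpy.ndarray'>: wrap the scalar again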

Choice 2

Combining NumPy numeric data types with standard Python numeric types results in NumPy numeric data types. Here, the first category includes things like np.int64, np.float64, np.complex128 (and many more), while the latter consists only of int, float and complex (for Python 2, also long). Thus, float times np.float64 results in np.float64.
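
A quick sketch of this rule:

import numpy as np

print(type(1. * np.float64(1.)))   # <class 'numpy.float64'>
print(type(2 + np.int64(3)))       # <class 'numpy.int64'>
print(type(1. * 1.))               # <class 'float'>: no NumPy types involved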

Taken together, the two choices indeed make x = 1.*np.array(1.) a NumPy scalar of type np.float64.
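
One practical consequence, sketched below (not part of the original answer), is that the item-assignment error from the question can be avoided by converting the NumPy scalar back into a 0-d array, e.g. with np.asarray:

import numpy as np

y = np.asarray(np.array(1.) * 1.)  # convert the np.float64 scalar back into a 0-d array
y[y > 0] = 2.                      # item assignment now works, just as for x in the question
print(repr(y))                     # array(2.)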

answered Nov 18 '22 by jmd_dk