
Weird behaviour with numpy array operations

Explain this:

>>> a = np.arange(10)
>>> a[2:]
array([2, 3, 4, 5, 6, 7, 8, 9])
>>> a[:-2]
array([0, 1, 2, 3, 4, 5, 6, 7])
>>> a[2:] - a[:-2]
array([2, 2, 2, 2, 2, 2, 2, 2])
>>> a[2:] -= a[:-2]
>>> a
array([0, 1, 2, 2, 2, 3, 4, 4, 4, 5])

The expected result is of course array([0, 1, 2, 2, 2, 2, 2, 2, 2, 2]).

I'm going to guess this is something to do with numpy parallelising things and not being smart enough to work out that it needs to make a temporary copy of the data first (or do the operation in the correct order).

In other words I suspect it is doing something naive like this:

for i in range(2, len(a)):
    a[i] -= a[i-2]
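
To check that guess, here is a minimal sketch using plain Python lists. It only mimics the suspected element-by-element order and is not NumPy's actual implementation; the function names are made up for illustration.

```python
def naive_inplace_subtract(values, shift=2):
    """Subtract in place, front to back, reading values already overwritten."""
    out = list(values)
    for i in range(shift, len(out)):
        out[i] -= out[i - shift]
    return out


def subtract_with_temporary(values, shift=2):
    """Compute every difference from the original values, then assemble."""
    original = list(values)
    diff = [original[i] - original[i - shift] for i in range(shift, len(original))]
    return original[:shift] + diff


naive = naive_inplace_subtract(range(10))
buffered = subtract_with_temporary(range(10))
print(naive)     # [0, 1, 2, 2, 2, 3, 4, 4, 4, 5]
print(buffered)  # [0, 1, 2, 2, 2, 2, 2, 2, 2, 2]
```

The front-to-back loop reproduces the output I am seeing, and the version with a temporary gives the result I expected.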

For reference it works in Matlab and Octave:

a = 0:9
a(3:end) = a(3:end) - a(1:end-2)

a =

  0  1  2  3  4  5  6  7  8  9

a =

  0  1  2  2  2  2  2  2  2  2

And actually it works fine if you do:

a[2:] = a[2:] - a[:-2]

So presumably this means that a -= b is not the same as a = a - b for numpy!
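
A short sketch of that difference, assuming only standard NumPy behaviour: `a -= b` modifies the existing array, while `a = a - b` builds a new array and rebinds the name. The last part copies the right-hand side explicitly, which should give the expected result regardless of how a given NumPy version treats overlapping operands (as far as I know, NumPy 1.13 and later handle the overlap themselves).

```python
import numpy as np

# In-place: the array that `alias` also refers to is modified.
a = np.arange(5)
alias = a
a -= 1
inplace_same_object = alias is a
alias_after_inplace = alias.tolist()

# Out-of-place: a new array is created and the name `a` is rebound to it,
# so `alias` still refers to the untouched original.
a = np.arange(5)
alias = a
a = a - 1
rebound_same_object = alias is a
alias_after_rebind = alias.tolist()

# Overlapping slices: copy the right-hand side before subtracting in place.
c = np.arange(10)
c[2:] -= c[:-2].copy()
overlap_result = c.tolist()

print(inplace_same_object, alias_after_inplace)   # True [-1, 0, 1, 2, 3]
print(rebound_same_object, alias_after_rebind)    # False [0, 1, 2, 3, 4]
print(overlap_result)                             # [0, 1, 2, 2, 2, 2, 2, 2, 2, 2]
```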

Actually now that I come to think of it, I think Mathworks gave this as one of the reasons for not implementing the +=, -=, /= and *= operators!

Timmmm asked Dec 04 '13 14:12





1 Answer

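When you slice a NumPy array as you are doing in the example, you get a view of the data rather than a copy. Here a[2:] and a[:-2] are overlapping views of the same buffer, so the in-place -= can read elements that it has already overwritten. Writing a[2:] = a[2:] - a[:-2] works because the right-hand side is evaluated into a new temporary array before the assignment.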

See:

http://scipy-lectures.github.io/advanced/advanced_numpy/#example-inplace-operations-caveat-emptor
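
A minimal sketch of how to check this yourself, using `np.shares_memory` and the `.base` attribute (the variable names are just for illustration):

```python
import numpy as np

a = np.arange(10)
left = a[2:]     # view onto elements 2..9
right = a[:-2]   # view onto elements 0..7

# Both slices point into the buffer owned by `a`, and they overlap.
views_share = np.shares_memory(left, right)
left_is_view = left.base is a

# Writing through a view changes the original array.
right[0] = 100
first = int(a[0])

# An explicit copy owns its own buffer.
safe = a[:-2].copy()
copy_shares = np.shares_memory(a, safe)

print(views_share, left_is_view, first, copy_shares)  # True True 100 False
```

If the two operands of an in-place operation share memory like this, taking a `.copy()` of the right-hand side is a simple way to rule out aliasing problems.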

JoshAdel answered Sep 26 '22 15:09
