
Does assignment with advanced indexing copy array data?

Tags: python, copy, numpy

I am slowly trying to understand the difference between views and copies in numpy, as well as mutable vs. immutable types.

If I access part of an array with 'advanced indexing' it is supposed to return a copy. This seems to be true:

In [1]: import numpy as np
In [2]: a = np.zeros((3,3))
In [3]: b = np.array(np.identity(3), dtype=bool)

In [4]: c = a[b]

In [5]: c[:] = 9

In [6]: a
Out[6]: 
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

Since c is just a copy, it does not share data and changing it does not mutate a. However, this is what confuses me:

In [7]: a[b] = 1

In [8]: a
Out[8]: 
array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

So, it seems, even if I use advanced indexing, assignment still treats the thing on the left as a view. Clearly the a in line 2 is the same object/data as the a in line 6, since mutating c has no effect on it.

So my question: is the a in line 8 the same object/data as before (not counting the diagonal of course) or is it a copy? In other words, was a's data copied to the new a, or was its data mutated in place?

For example, is it like:

x = [1,2,3]
x += [4]

or like:

y = (1,2,3)
y += (4,)

I don't know how to check for this because in either case, a.flags.owndata is True. Please feel free to elaborate or answer a different question if I'm thinking about this in a confusing way.

askewchan asked Mar 28 '13 20:03


2 Answers

When you do c = a[b], a.__getitem__ is called with b as its only argument, and whatever gets returned is assigned to c.

When you do a[b] = c, a.__setitem__ is called with b and c as arguments, and whatever gets returned is silently discarded.

So despite sharing the a[b] syntax, the two expressions do different things. You could subclass ndarray, override these two methods, and have them behave differently. By default in numpy, the former returns a copy (if b is an array) but the latter modifies a in place.
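To make the dispatch visible, here is a minimal sketch of such a subclass. The name LoggingArray and the calls list are hypothetical, introduced only for illustration; the subclass simply records which special method runs and then defers to ndarray's default behaviour.

```python
import numpy as np

calls = []  # hypothetical log of which special methods were invoked


class LoggingArray(np.ndarray):
    """Illustrative ndarray subclass; not part of numpy."""

    def __getitem__(self, index):
        calls.append("__getitem__")
        return super().__getitem__(index)

    def __setitem__(self, index, value):
        calls.append("__setitem__")
        super().__setitem__(index, value)


a = np.zeros((3, 3)).view(LoggingArray)
b = np.identity(3, dtype=bool)

c = a[b]   # read: dispatches to __getitem__, which returns a new array
a[b] = 1   # write: dispatches to __setitem__, which writes into a itself

print(calls)
print(np.shares_memory(a, c))  # False: c is a copy, not a view of a
```

The same a[b] text therefore ends up in two different methods depending on which side of the equals sign it appears on.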

Jaime answered Sep 27 '22 17:09


Yes, it is the same object. Here's how you check:

>>> a
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
>>> a2 = a
>>> a[b] = 1
>>> a2 is a
True
>>> a2
array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

Assigning to some expression in Python is not the same as just reading the value of that expression. When you do c = a[b], with a[b] on the right of the equals sign, it returns a new object. When you do a[b] = 1, with a[b] on the left of the equals sign, it modifies the original object.
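Since a.flags.owndata is True in both cases, another way to check is to compare the address of the underlying data buffer before and after the assignment. The sketch below assumes a reasonably recent numpy (np.shares_memory is not available in very old releases); the variable names are illustrative.

```python
import numpy as np

a = np.zeros((3, 3))
b = np.identity(3, dtype=bool)

# Record the identity of the array object and the address of its data buffer.
id_before = id(a)
address_before = a.__array_interface__["data"][0]

c = a[b]   # advanced indexing on the right-hand side: produces a new array
a[b] = 1   # advanced indexing on the left-hand side: writes into a

same_object = id(a) == id_before
same_buffer = a.__array_interface__["data"][0] == address_before
c_shares_memory = np.shares_memory(a, c)

print(same_object, same_buffer, c_shares_memory)
```

If the assignment had built a new array, the buffer address would differ; here the object and its buffer stay the same, while c lives in separate memory.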

In fact, an expression like a[b] = 1 cannot change what name a is bound to. The code that handles obj[index] = value only gets to know the object obj, not what name was used to refer to that object, so it can't change what that name refers to.
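A toy class shows the same thing without numpy. Box is a hypothetical container written only for this illustration: its __setitem__ receives the object, the index, and the value, and nothing about the name the caller used.

```python
class Box:
    """Hypothetical container used only to show what __setitem__ receives."""

    def __init__(self, items):
        self.items = list(items)

    def __setitem__(self, index, value):
        # Only self, index and value arrive here. The caller's variable name
        # is not passed in, so this method has no way to rebind it.
        self.items[index] = value


box = Box([0, 0, 0])
alias = box

box[1] = 9  # equivalent to Box.__setitem__(box, 1, 9)

print(alias is box)   # True: both names still refer to the same object
print(alias.items)    # [0, 9, 0]: the change is visible through either name
```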

BrenBarn answered Sep 27 '22 18:09