If you change a view of a numpy array, the original array is also altered. This is intended behaviour.
import numpy as np
arr = np.array([1,2,3])
mask = np.array([True, False, False])
arr[mask] = 0
arr
# Out: array([0, 2, 3])
However, if I take a view of such a view, and change that, then the original array is not altered:
arr = np.array([1,2,3])
mask_1 = np.array([True, False, False])
mask_1_arr = arr[mask_1] # Becomes: array([1])
mask_2 = np.array([True])
mask_1_arr[mask_2] = 0
arr
# Out: array([1, 2, 3])
This implies to me that, when you take a view of a view, you actually get back a copy. Is this correct? Why is this?
The same behaviour occurs if I use numpy arrays of numerical indices instead of a numpy array of boolean values. (E.g. arr[np.array([0])][np.array([0])] = 0 doesn't change the first element of arr to 0.)
Selection by basic slicing always returns a view. Selection by advanced indexing always returns a copy. Selection by boolean mask is a form of advanced indexing. (The other form of advanced indexing is selection by integer array.)
However, assignment by advanced indexing affects the original array.
So
mask = np.array([True, False, False])
arr[mask] = 0
affects arr because it is an assignment. In contrast,
mask_1_arr = arr[mask_1]
is selection by boolean mask, so mask_1_arr is a copy of part of arr.
Once you have a copy, the jig is up. When Python executes
mask_2 = np.array([True])
mask_1_arr[mask_2] = 0
the assignment affects mask_1_arr, but since mask_1_arr is a copy, it has no effect on arr.
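If the goal is to modify arr itself, one option is to fold the two selections into a single index, so that only one assignment through advanced indexing happens. The following is a minimal sketch, not the only way to do it: the names positions, unchanged and changed are introduced here for illustration, and np.flatnonzero is used to turn the first mask into integer positions.

```python
import numpy as np

arr = np.array([1, 2, 3])
mask_1 = np.array([True, False, False])
mask_2 = np.array([True])

# Chained form: arr[mask_1] produces a temporary copy, and the
# assignment lands in that copy, so arr is unchanged.
arr[mask_1][mask_2] = 0
unchanged = arr.tolist()  # [1, 2, 3]

# Combined form: translate the second mask into positions in arr,
# then assign through a single index.
positions = np.flatnonzero(mask_1)[mask_2]  # array([0])
arr[positions] = 0
changed = arr.tolist()  # [0, 2, 3]
```

The combined form works because the last step is a plain assignment on arr, not on a copy of part of it.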
| | basic slicing | advanced indexing |
|------------+------------------+-------------------|
| selection | view | copy |
| assignment | affects original | affects original |
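Each cell of the table can be checked directly. The sketch below uses np.shares_memory to tell a view from a copy. The variable names (sliced, masked, slice_shares, mask_shares, result) are chosen for illustration only.

```python
import numpy as np

arr = np.array([1, 2, 3])
mask = np.array([True, False, False])

# Selection
sliced = arr[0:2]   # basic slicing
masked = arr[mask]  # advanced (boolean) indexing

slice_shares = np.shares_memory(arr, sliced)  # True: sliced is a view
mask_shares = np.shares_memory(arr, masked)   # False: masked is a copy

# Assignment: both kinds of index write into arr itself.
arr[0:2] = 10
arr[mask] = 0
result = arr.tolist()  # [0, 10, 3]
```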
Under the hood, arr[mask] = something causes Python to call arr.__setitem__(mask, something). The ndarray.__setitem__ method is implemented to modify arr. After all, that is the natural thing one should expect __setitem__ to do.
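As a small illustration of that equivalence, the two arrays below (a and b, names chosen here for the example) receive the same assignment, once through the subscript syntax and once through an explicit __setitem__ call. Calling the dunder method directly is only for demonstration. Normal code would use the subscript form.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
mask = np.array([True, False, False])

a[mask] = 0             # subscript assignment syntax
b.__setitem__(mask, 0)  # the method call Python makes for the line above

same = np.array_equal(a, b)  # True: both arrays were modified in place
```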
In contrast, as an expression arr[indexer] causes Python to call arr.__getitem__(indexer). When indexer is a slice, the regularity of the elements allows NumPy to return a view (by modifying the strides and offset). When indexer is an arbitrary boolean mask or arbitrary array of integers, there is in general no regularity to the elements selected, so there is no way to return a view. Hence a copy must be returned.
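The strides and offset of a slice can be inspected to see this. The sketch below assumes an int64 array (8 bytes per element) and reads the buffer addresses from __array_interface__ to compute the offset. The names view, picked and offset are illustrative.

```python
import numpy as np

arr = np.arange(10, dtype=np.int64)

# A slice picks regularly spaced elements (2, 5, 8), so it can be
# described by a new start offset and stride over the same buffer.
view = arr[2::3]
view_is_view = view.base is arr  # True: view borrows arr's memory
view_strides = view.strides      # (24,): 3 elements * 8 bytes each
offset = (view.__array_interface__["data"][0]
          - arr.__array_interface__["data"][0])  # 16: 2 elements * 8 bytes

# An arbitrary integer index has no single stride that describes it,
# so the selected elements are copied into a new buffer.
picked = arr[np.array([0, 1, 7])]
picked_shares = np.shares_memory(arr, picked)  # False
```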