arr = np.arange(0,11)
slice_of_arr = arr[0:6]
slice_of_arr[:]=99
# slice_of_arr returns
array([99, 99, 99, 99, 99, 99])
# arr returns
array([99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10])
As the example shown above, you cannot directly change the value of the slice_of_arr
, because it's a view of arr
, not a new variable.
My questions are:
.copy
and then assign value?.copy
? How can I change this default behavior of NumPy?The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy. The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.
If you are dealing with NumPy matrices, then numpy. copy() will give you a deep copy.
copy() is supposed to create a shallow copy of its argument, but when applied to a NumPy array it creates a shallow copy in sense B, i.e. the new array gets its own copy of the data buffer, so changes to one array do not affect the other. x_copy = x. copy() is all you need to make a copy of x .
copy() function creates a deep copy. It is a complete copy of the array and its data, and doesn't share with the original array.
I think you have the answers in the other comments, but more specifically:
1.a. Why does NumPy design like this?
Because it's way faster (constant time) to create a view rather than creating a whole array (linear time).
1.b. Wouldn't it be tedious every time you need to .copy and then assign value?
Actually it's not that common to need to create a copy. So no, it's not tedious. Even if it can be surprising at first this design is very good.
2.a. Is there anything I can do, to get rid of the .copy?
I can't really tell without seing real code. In the toy example you give, you can't avoid creating a copy, but in real code you usually apply functions to the data, which return another array so a copy isn't needed.
Can you give an example of real code where you need to call .copy
repeatedly ?
2.b. How can I change this default behavior of NumPy?
You can't. Try to get used to it, you'll see how powerfull it is.
What does (numpy) __array_wrap__ do?
talks about ndarray
subclasses and hooks like __array_wrap__
. np.array
takes copy
parameter, forcing the result to be a copy, even if it isn't required by other considerations. ravel
returns a view, flatten
a copy. So it is probably possible, and maybe not too difficult, to construct a ndarray
subclass that forces a copy. It may involve modifying a hook like __array_wrap__
.
Or maybe modifying the .__getitem__
method. Indexing as in slice_of_arr = arr[0:6]
involves a call to __getitem__
. For ndarray
this is compiled, but for a masked array, it is python code that you could use as an example:
/usr/lib/python3/dist-packages/numpy/ma/core.py
It may be something as simple as
def __getitem__(self, indx):
"""x.__getitem__(y) <==> x[y]
"""
# _data = ndarray.view(self, ndarray) # change to:
_data = ndarray.copy(self, ndarray)
dout = ndarray.__getitem__(_data, indx)
return dout
But I suspect that by the time you develop and fully test such a subclass, you might fall in love with the default no-copy approach. While this view-v-copy business bites many new comers (especially if coming from MATLAB), I haven't seen complaints from experienced users. Look at other numpy SO questions; you won't see a lot copy()
calls.
Even regular Python users are used asking themselves whether a reference or slice is a copy or not, and whether something is mutable or not.
for example with lists:
In [754]: ll=[1,2,[3,4,5],6]
In [755]: llslice=ll[1:-1]
In [756]: llslice[1][1:2]=[10,11,12]
In [757]: ll
Out[757]: [1, 2, [3, 10, 11, 12, 5], 6]
modifying an item an item inside a slice modifies that same item in the original list. In contrast to numpy
, a list slice is a copy. But it's a shallow copy. You have to take extra effort to make a deep copy (import copy
).
/usr/lib/python3/dist-packages/numpy/lib/index_tricks.py
contains some indexing functions aimed at making certain indexing operations more convenient. Several are actually classes, or class instances, with custom __getitem__
methods. They may also serve as models of how to customize your slicing and indexing.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With