As I understand it, a numpy array is an object storing values in a contiguous block of memory, while Python's built-in containers (list, tuple, set, dict) contain references to objects. Basic slicing of a numpy array returns a view containing a specified subset of those values.
On the surface, the view looks like another numpy array (type(aView) returns numpy.ndarray), but its values are not copies but rather the same values as those of the original array; altering the values of a view in-place alters the values of the original array as well.
How does a view do this? I would imagine a view should contain some sort of pointers to the values in the original array, but a couple things give me pause:
The values in the array aren't objects, and I don't know how references can be made to small pieces of the same object.
creating a copy of an array is much slower than making a view, and if the view is just an array of some sort of pointer values, I would expect both copies and views to be made with about the same speed.
A NumPy array knows its base address, data type, shape, and strides. Most applications don't need to explicitly deal with the strides, but they are what make some of this work. The strides indicate how many bytes must be added to increment a given dimension by one logical unit (e.g. row).
If you start with a 3x3 array of float64 (aka f8
) at address 0x1000, and you want a view of the 2x2 subarray which starts in the center of the original, all you need is to increment the base address by 4 elements (3 for the entire first row, 1 to move from the left to the center of the middle row) and remember that the each row starts 24 bytes after the previous one (despite being only 16 bytes long).
Conceptually we go from this:
base=0x1000
shape=(3,3)
strides=(24,8)
dtype='f8'
To this:
base=0x1020 (added 1*24 + 1*8 for [1:,1:] view)
shape=(2,2)
strides=(24,8)
dtype='f8'
And the view takes these elements:
. . .
. 4 5
. 7 8
Some flags are adjusted on the view, such as the C_CONTIGUOUS flag which needs to be unset because the view is not a contiguous region anymore.
Strides not only support views of NumPy arrays, but also views of data structures which did not originate in NumPy. For example if you have a C array of structs and the first member of each is a point (x,y)
, you can construct a view of only these points by setting the stride to the size of the entire struct, despite the dtype being just the two numbers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With