I am trying to understand the meaning of the ndarray.data field in numpy (see the memory layout section of the reference page on N-dimensional arrays), especially for views into arrays. To quote the documentation:
ndarray.data -- Python buffer object pointing to the start of the array’s data
According to this description, I was expecting this to be a pointer to the C-array underlying the instance of ndarray.
Consider x = np.arange(5, dtype=np.float64).
Form y as a view into x using a slice: y = x[3:1:-1].
I was expecting x.data to point at the location of 0.0 and y.data to point at the location of 3.0. I therefore expected the memory pointer printed by y.data to be offset by 3*x.itemsize bytes from the memory pointer printed by x.data, but this does not appear to be the case:
>>> import numpy as np
>>> x = np.arange(5, dtype=np.float64)
>>> y = x[3:1:-1]
>>> x.data
<memory at 0x000000F2F5150348>
>>> y.data
<memory at 0x000000F2F5150408>
>>> int('0x000000F2F5150408', 16) - int('0x000000F2F5150348', 16)
192
>>> 3*x.itemsize
24
The 'data' key in the __array_interface__ dictionary associated with the ndarray instance behaves more like I expect, although the value itself may not be a pointer:
>>> y.__array_interface__['data'][0] - x.__array_interface__['data'][0]
24
So this raises the question: what does ndarray.data give?
Thanks in advance.
<memory at 0x000000F2F5150348> is a memoryview object located at address 0x000000F2F5150348; the buffer it provides access to is located somewhere else.
Memoryviews provide a number of operations described in the relevant official documentation, but at least on the Python-side API, they do not provide any way to access the raw address of the memory they expose. In particular, the number after "at" in the printed representation is not what you're looking for.
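To make the point about the printed number concrete, here is a minimal sketch. It assumes CPython, where the default repr of a memoryview shows the address of the memoryview object itself (the same value id() returns), and it assumes NumPy builds a new memoryview on every access to .data. Both are implementation details rather than documented guarantees.

```python
import numpy as np

x = np.arange(5, dtype=np.float64)

mv = x.data  # a memoryview object wrapping x's buffer

# Parse the hex number out of "<memory at 0x...>".
repr_address = int(repr(mv).split()[-1].rstrip(">"), 16)

# Assumption (CPython): the repr address is the address of the wrapper
# object, i.e. id(mv), not the address of the array data.
same_as_id = repr_address == id(mv)

# Assumption (NumPy implementation detail): every access to .data creates
# a new wrapper, so two simultaneous accesses give two distinct objects.
distinct_wrappers = x.data is not x.data

print(same_as_id, distinct_wrappers)
```

If both values print as True, the difference of 192 in the question is simply the distance between two unrelated wrapper objects, not between array elements.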
Generally the number displayed by x.data isn't meant to be used by you. x.data is the buffer, which can be used in other contexts that expect a buffer.
np.frombuffer(x.data,dtype=float)
replicates your x.
np.frombuffer(x[3:].data,dtype=float)
This replicates x[3:]. But from Python you can't take x.data, add 192 bits (3 elements * 8 bytes * 8 bits, i.e. 24 bytes) to it, and expect to get x[3:].
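Although the buffer object itself cannot be shifted by arithmetic, np.frombuffer accepts a byte offset into the buffer. The following is a small sketch of that route, not the only way to do it; the variable name tail is just illustrative.

```python
import numpy as np

x = np.arange(5, dtype=np.float64)

# offset is counted in bytes, so skipping 3 elements means 3 * itemsize.
tail = np.frombuffer(x.data, dtype=np.float64, offset=3 * x.itemsize)

print(tail)                        # same values as x[3:]
print(np.shares_memory(tail, x))   # tail reuses x's memory rather than copying
```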
I often use the __array_interface__['data'] value to check whether two variables share a data buffer, but I don't use that number for anything. These are informative numbers, not working values.
I recently explored this in "Creating a NumPy array directly from __array_interface__".
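The kind of check described above can be sketched as follows. The helper data_address is a hypothetical name introduced here for illustration; np.shares_memory and the .base attribute are shown as alternatives that avoid handling raw addresses.

```python
import numpy as np

x = np.arange(5, dtype=np.float64)
y = x[3:1:-1]   # a view into x, starting at x[3]
z = x.copy()    # an independent buffer


def data_address(a):
    """Return the address of the array's first element (informational only)."""
    return a.__array_interface__["data"][0]


# The view starts 3 elements (24 bytes) into x's buffer.
print(data_address(y) - data_address(x))

# The same address is also exposed through the ctypes attribute.
print(data_address(x) == x.ctypes.data)

# Higher-level checks that do not need the raw numbers.
print(np.shares_memory(x, y))   # view: shares memory with x
print(np.shares_memory(x, z))   # copy: does not
print(y.base is x)              # the view records x as its base
```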