Behavior of ndarray.data for views in numpy

I am trying to understand the meaning of the ndarray.data field in numpy (see the memory layout section of the reference page on N-dimensional arrays), especially for views into arrays. To quote the documentation:

ndarray.data -- Python buffer object pointing to the start of the array’s data

According to this description, I was expecting this to be a pointer to the C-array underlying the instance of ndarray.

Consider x = np.arange(5, dtype=np.float64).

Form y as a view into x using a slice: y = x[3:1:-1].

I was expecting x.data to point at the location of 0.0 and y.data to point at the location of 3.0. I was thus expecting the memory pointer printed by y.data to be offset by 3*x.itemsize bytes from the memory pointer printed by x.data, but this does not appear to be the case:

>>> import numpy as np
>>> x = np.arange(5, dtype=np.float64)
>>> y = x[3:1:-1]
>>> x.data
<memory at 0x000000F2F5150348>
>>> y.data
<memory at 0x000000F2F5150408>
>>> int('0x000000F2F5150408', 16) - int('0x000000F2F5150348', 16)
192
>>> 3*x.itemsize
24

The 'data' key in the __array_interface__ dictionary associated with the ndarray instance behaves more like I expect, although it may itself not be a pointer:

>>> y.__array_interface__['data'][0] - x.__array_interface__['data'][0]
24

So this raises the question: what does ndarray.data actually give?

Thanks in advance.

user40314 asked May 10 '26 07:05


2 Answers

<memory at 0x000000F2F5150348> is a memoryview object located at address 0x000000F2F5150348; the buffer it provides access to is located somewhere else.

Memoryviews provide a number of operations, described in the relevant official documentation, but at least in the Python-side API they do not provide any way to access the raw address of the memory they expose. In particular, the number after "at" in the repr is not what you're looking for.
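
A minimal sketch of this point, assuming CPython, where (as an implementation detail) the number in a memoryview's repr is the address of the memoryview object itself, the same value id() returns. The variable names are only illustrative, and the buffer address is read from __array_interface__ as in the question:

```python
import numpy as np

x = np.arange(5, dtype=np.float64)
y = x[3:1:-1]          # view starting at x[3], stepping backwards

mx = x.data            # a memoryview wrapping x's buffer
my = y.data            # a separate memoryview wrapping y's buffer

# Assumption: CPython formats the repr as "<memory at 0x...>" using the
# address of the memoryview object, which is also what id() returns.
addr_in_repr = int(repr(mx).split()[-1].rstrip(">"), 16)
repr_is_object_id = (addr_in_repr == id(mx))

# The address of the first element comes from __array_interface__ instead.
x_addr = x.__array_interface__["data"][0]
y_addr = y.__array_interface__["data"][0]
offset = y_addr - x_addr

print(repr_is_object_id)          # whether the repr number is just id(mx)
print(offset, 3 * x.itemsize)     # byte offset of y's start within x
```

The difference between the two repr numbers therefore only says where the two memoryview objects happen to live, not where the array data starts.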

user2357112 supports Monica answered May 12 '26 20:05


Generally the number displayed by x.data isn't meant to be used by you. x.data is the buffer, which can be used in other contexts that expect a buffer.

np.frombuffer(x.data,dtype=float)

replicates your x.

np.frombuffer(x[3:].data,dtype=float)

replicates x[3:]. But from Python you can't take x.data, add 24 bytes (3*8, i.e. 192 bits) to it, and expect to get x[3:].
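
As a rough sketch of the two np.frombuffer calls above (the names full and tail are only for illustration), the rebuilt arrays can be compared against x and checked for shared memory:

```python
import numpy as np

x = np.arange(5, dtype=np.float64)

# Rebuild arrays from the buffers exposed by x and by the slice x[3:].
full = np.frombuffer(x.data, dtype=float)
tail = np.frombuffer(x[3:].data, dtype=float)

print(full)
print(tail)

# The rebuilt arrays wrap the same memory as x rather than copying it.
print(np.shares_memory(x, full), np.shares_memory(x, tail))
```

The offset into x is carried by the slice's own buffer, not by any arithmetic on x.data.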

I often use the __array_interface__['data'] value to check whether two variables share a data buffer, but I don't use that number for anything else. These are informative numbers, not working values.

I recently explored this in the question "Creating a NumPy array directly from __array_interface__".
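
The sharing check mentioned above might look something like this sketch. The helper data_address is a hypothetical name introduced here, not part of NumPy; np.shares_memory is NumPy's own function for the same question:

```python
import numpy as np

def data_address(a):
    """Hypothetical helper: address of the array's first element."""
    return a.__array_interface__["data"][0]

x = np.arange(5, dtype=np.float64)
view = x[1:]            # shares x's buffer, starting one element in
copy = x[1:].copy()     # owns a separate buffer

# A view's start address lies inside x's buffer; a copy's does not.
view_offset = data_address(view) - data_address(x)
x_start, x_end = data_address(x), data_address(x) + x.nbytes
copy_inside_x = x_start <= data_address(copy) < x_end

print(view_offset)      # bytes between the start of x and the start of view
print(copy_inside_x)
print(np.shares_memory(x, view), np.shares_memory(x, copy))
```

Here the addresses are only compared with each other, which fits the remark that they are informative numbers rather than working values.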

hpaulj answered May 12 '26 20:05