
Cython: Memory view of freed memory

In Cython code, I can allocate some memory and wrap it in a memory view, e.g. like this:

from cpython.mem cimport PyMem_Malloc, PyMem_Free

cdef double* ptr
cdef double[::1] view
ptr = <double*> PyMem_Malloc(N*sizeof(double))
view = <double[:N]> ptr

If I now free the memory using PyMem_Free(ptr), trying to access elements like ptr[i] throws an error, as it should. However, I can safely try to access view[i] (it does not return the original data though).

My question is this: Is it always safe to just deallocate the pointer? Is the memory view object somehow informed of the memory being freed, or should I manually remove the view somehow? Also, is the memory guaranteed to be freed, even though it is referred to by memory views?

jmd_dk asked Mar 28 '16 15:03


1 Answer

It requires a bit of digging into the C code to show this, but:

The line view = <double[:N]> ptr actually generates a __pyx_array_obj. This is the same type detailed in the documentation as a "Cython array" and cimportable as cython.view.array. The Cython array does have an optional member called callback_free_data that can act as a destructor.

The line translates as:

struct __pyx_array_obj *__pyx_t_1 = NULL;
/* ... */
__pyx_t_1 = __pyx_array_new(__pyx_t_2, sizeof(double), PyBytes_AS_STRING(__pyx_t_3), (char *) "c", (char *) __pyx_v_ptr);

(__pyx_t_2 and __pyx_t_3 are just temporaries storing the size and format, respectively.) If we look inside __pyx_array_new, we see firstly that the array's data member is assigned directly from the pointer passed in as __pyx_v_ptr:

__pyx_v_result->data = __pyx_v_buf;

(i.e. a copy is not made) and secondly that callback_free_data is not set. Side note: The C code for cython.view.array is actually generated from Cython code so if you want to investigate further it's probably easier to read that than the generated C.


Essentially, the memoryview holds a cython.view.array which has a pointer to the original data, but no callback_free_data set. When the memoryview dies the destructor for the cython.view.array is called. This cleans up a few internals, but does nothing to free the data it points to (since it has no indication of how to do so).
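
The ownership rule described above can be modelled in a few lines of plain Python. This is only a toy sketch, not Cython's actual implementation: ToyHeap and ToyArray are hypothetical names, and "addresses" are just integers handed out by a fake allocator. It shows that destroying an array without a callback leaves the block allocated, while an array with a callback frees it.

```python
class ToyHeap:
    """Hypothetical stand-in for the C heap: tracks which fake addresses are live."""

    def __init__(self):
        self.live = set()
        self._next_addr = 1

    def malloc(self):
        addr = self._next_addr
        self._next_addr += 1
        self.live.add(addr)
        return addr

    def free(self, addr):
        # Raises KeyError on a double free, which real free() would not report.
        self.live.remove(addr)


class ToyArray:
    """Toy model of the array object created by `<double[:N]> ptr`."""

    def __init__(self, data, callback_free_data=None):
        self.data = data  # the pointer is stored as-is; no copy of the data
        self.callback_free_data = callback_free_data

    def dealloc(self):
        # Model of the destructor: the data is released only if the array
        # was told how to release it.
        if self.callback_free_data is not None:
            self.callback_free_data(self.data)
        self.data = None


heap = ToyHeap()

ptr = heap.malloc()
borrowed = ToyArray(ptr)      # like the plain pointer cast: no callback
borrowed.dealloc()
leaked = ptr in heap.live     # True: the caller still has to free ptr
heap.free(ptr)

ptr2 = heap.malloc()
owning = ToyArray(ptr2, callback_free_data=heap.free)
owning.dealloc()
freed = ptr2 not in heap.live  # True: the callback released the block
```

The model deliberately uses an explicit dealloc() method rather than __del__, so that the point at which the "destructor" runs is unambiguous.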

It is therefore not safe to access the memoryview after you have called PyMem_Free. The fact that you seem to get away with it is luck. It is safe for the memoryview to keep existing, though, provided you don't access it. A function like:

def good():
    cdef double* ptr
    cdef double[::1] view
    ptr = <double*> PyMem_Malloc(N*sizeof(double))
    try:
        view = <double[:N]> ptr
        # some other stuff
    finally:
        PyMem_Free(ptr)
    # some other stuff not involving ptr or view

would be fine. A function like:

def bad():
    cdef double* ptr
    cdef double[::1] view
    ptr = <double*> PyMem_Malloc(N*sizeof(double))
    try:
        view = <double[:N]> ptr
        # some other stuff
    finally:
        PyMem_Free(ptr)
    view[0] = 0
    return view

would be a bad idea, since it writes to view after the data it views has been freed and then passes back a memoryview that points to freed memory.

You should definitely make sure to call PyMem_Free at some point, otherwise you have a memory leak. If view gets passed around so that its lifetime is hard to track, one option is to manually create a cython.view.array with callback_free_data set, so that the array frees the memory itself when it is destroyed:

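from cython cimport view as cview  # alias avoids a clash with the memoryview variable "view"

cdef cview.array my_array = cview.array(shape=(N,), itemsize=sizeof(double),
                                        format="d", allocate_buffer=False)
my_array.data = <char *> ptr
my_array.callback_free_data = PyMem_Free
view = my_array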

If the lifetime of view is obvious then you can just call PyMem_Free on ptr as you've been doing.
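
For readers who want to experiment with both lifetime strategies without compiling anything, here is a rough plain-Python analogue using ctypes. It is a sketch under assumptions, not the Cython code path: it assumes CPython (so that ctypes.pythonapi exposes PyMem_Malloc and PyMem_Free), and OwnedDoubles with its view() and release() methods is a hypothetical helper. The weakref.finalize call plays the role that callback_free_data plays for a Cython array.

```python
import ctypes
import weakref

# CPython's allocator, reached through ctypes.pythonapi (CPython-specific).
PyMem_Malloc = ctypes.pythonapi.PyMem_Malloc
PyMem_Malloc.restype = ctypes.c_void_p
PyMem_Malloc.argtypes = [ctypes.c_size_t]
PyMem_Free = ctypes.pythonapi.PyMem_Free
PyMem_Free.restype = None
PyMem_Free.argtypes = [ctypes.c_void_p]


class OwnedDoubles:
    """Hypothetical owner of a PyMem_Malloc'd block of n doubles."""

    def __init__(self, n):
        ptr = PyMem_Malloc(n * ctypes.sizeof(ctypes.c_double))
        if not ptr:
            raise MemoryError("PyMem_Malloc failed")
        self.n = n
        self.ptr = ptr
        # Counterpart of callback_free_data: runs at most once, either via
        # release() or when this object is garbage collected.
        self._finalizer = weakref.finalize(self, PyMem_Free, ptr)

    def view(self):
        """Return a non-owning ctypes array over the block."""
        if not self._finalizer.alive:
            raise ValueError("memory has already been freed")
        return (ctypes.c_double * self.n).from_address(self.ptr)

    def release(self):
        """Free the block now; views created earlier must not be used again."""
        self._finalizer()


buf = OwnedDoubles(4)
v = buf.view()
for i in range(4):
    v[i] = i * 0.5
values = list(v)   # read while the memory is still valid
del v              # drop the view before freeing
buf.release()      # explicit free; calling it again is a no-op
```

One difference from Cython is worth keeping in mind: a memoryview keeps its underlying array object alive, whereas the ctypes array returned by view() does not keep the OwnedDoubles instance alive, so the owner has to be held for as long as the view is in use.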

DavidW answered Sep 28 '22 06:09