Creating a numpy array in C from an allocated array is causing memory leaks

I have traced a memory leak in my program to a Python module I wrote in C to efficiently parse an array expressed in ASCII hex (e.g. "FF 39 00 FC ...").

char* buf;
unsigned short bytesPerTable;
if (!PyArg_ParseTuple(args, "sH", &buf, &bytesPerTable))
{
    return NULL;
}

unsigned short rowSize = bytesPerTable;
char* CArray = malloc(rowSize * sizeof(char));

// Populate CArray with data parsed from buf
ascii_buf_to_table(buf, bytesPerTable, rowSize, CArray);

int dims[1] = {rowSize};

PyObject* pythonArray = PyArray_SimpleNewFromData(1, (npy_intp*)dims, NPY_INT8, (void*)CArray);
return Py_BuildValue("(O)", pythonArray);

I realized that numpy does not know to free the memory allocated for CArray, which causes a memory leak. After some research, and following a suggestion in the comments on this article, I added the following line, which is supposed to tell the array that it "owns" its data and should free it when the array is deleted.

PyArray_ENABLEFLAGS((PyArrayObject*)pythonArray, NPY_ARRAY_OWNDATA);

But I am still getting the memory leak. What am I doing wrong? How do I get the NPY_ARRAY_OWNDATA flag to work properly?

For reference, the documentation in ndarraytypes.h makes it seem like this should work:

/*
 * If set, the array owns the data: it will be free'd when the array
 * is deleted.
 *
 * This flag may be tested for in PyArray_FLAGS(arr).
 */
#define NPY_ARRAY_OWNDATA         0x0004

Also for reference, the following code (calling the Python function defined in C) demonstrates the memory leak.

tableData = "FF 39 00 FC FD 37 FF FF F9 38 FE FF F1 39 FE FC \n" \
            "EF 38 FF FE 47 40 00 FB 3D 3B 00 FE 41 3D 00 FE \n" \
            "43 3E 00 FF 42 3C FE 02 3C 40 FD 02 31 40 FE FF \n" \
            "2E 3E FF FE 24 3D FF FE 15 3E 00 FC 0D 3C 01 FA \n" \
            "02 3E 01 FE 01 3E 00 FF F7 3F FF FB F4 3F FF FB \n" \
            "F1 3D FE 00 F4 3D FE 00 F9 3E FE FC FE 3E FD FE \n" \
            "F6 3E FE 02 03 3E 00 FE 04 3E 00 FC 0B 3D 00 FD \n" \
            "09 3A 00 01 03 3D 00 FD FB 3B FE FB FD 3E FD FF \n"

for i in xrange(1000000):
    PES = ParseTable(tableData, 128, 4)  # Causes memory usage to skyrocket

asked Mar 06 '15 18:03

dpitch40

1 Answer

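It's probably a reference-count issue. If so, the fix in your code would be to return Py_BuildValue("(N)", pythonArray) instead of Py_BuildValue("(O)", pythonArray). From How to extend NumPy: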

One common source of reference-count errors is the Py_BuildValue function. Pay careful attention to the difference between the ‘N’ format character and the ‘O’ format character. If you create a new object in your subroutine (such as an output array), and you are passing it back in a tuple of return values, then you should most likely use the ‘N’ format character in Py_BuildValue. The ‘O’ character will increase the reference count by one. This will leave the caller with two reference counts for a brand-new array. When the variable is deleted and the reference count decremented by one, there will still be that extra reference count, and the array will never be deallocated. You will have a reference-counting induced memory leak. Using the ‘N’ character will avoid this situation as it will return to the caller an object (inside the tuple) with a single reference count.
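To make the effect of that extra reference concrete, here is a rough CPython-only sketch in Python. It does not call Py_BuildValue itself; it simulates the leftover reference with ctypes.pythonapi.Py_IncRef, and the names Payload, build_like_O, build_like_N and is_freed_after_release are made up for illustration (they are not part of NumPy or of the module in the question).

```python
import ctypes
import gc
import weakref


class Payload:
    """Hypothetical stand-in for the array created inside the C function."""


# Py_IncRef is the function form of the Py_INCREF macro in the CPython C API.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None


def build_like_O():
    """Simulate Py_BuildValue("(O)", obj): the tuple takes its own reference
    and the creator's original reference is never released."""
    obj = Payload()
    _incref(ctypes.py_object(obj))  # the stray reference nobody will drop
    return (obj,)


def build_like_N():
    """Simulate Py_BuildValue("(N)", obj): the tuple simply takes over the
    creator's reference, so no extra count is left behind."""
    obj = Payload()
    return (obj,)


def is_freed_after_release(builder):
    """Return True if the object is deallocated once the caller lets go of it."""
    result = builder()
    probe = weakref.ref(result[0])  # a weak reference does not keep obj alive
    del result                      # the caller drops its only reference
    gc.collect()
    return probe() is None


leaked = not is_freed_after_release(build_like_O)
freed = is_freed_after_release(build_like_N)
print("O-style result leaked:", leaked)
print("N-style result freed:", freed)
```

The O-style case is the situation the loop in the question runs into on every iteration: the caller releases everything it knows about, yet the count never reaches zero, so the array (and the buffer it owns) is never released.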

answered Sep 16 '22 12:09

Ulfalizer

Tags: python, c, memory-leaks, malloc, numpy
