Potential memory leak when converting wide char to python string

Question

I have the following code in a Cython .pyx file, which converts a wchar_t* to a Python string (unicode):

All code below is for Python 2.7.4.

cdef wc_to_pystr(wchar_t *buf):
    if buf == NULL:
        return None
    cdef size_t buflen
    buflen = wcslen(buf)
    cdef PyObject *p = PyUnicode_FromWideChar(buf, buflen)
    return <unicode>p

I called this function in a loop like this:

cdef wchar_t* buf = <wchar_t*>calloc(100, sizeof(wchar_t))
# ... copy some wide string to buf

for n in range(30000):
    u = wc_to_pystr(buf)  # <== behaves as if it's a memory leak

free(buf)

I tested this on Windows and observed that the process's memory usage (as seen in Task Manager) keeps increasing, so I suspect there is a memory leak here.

This is surprising because:

1. As per my understanding, the API PyUnicode_FromWideChar() copies the supplied buffer.
2. Every time the variable 'u' is assigned a different value, the previous value should be freed.
3. Since the source buffer ('buf') remains as is and is released only after the loop ends, I was expecting memory usage to stop growing after a certain point.
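The first assumption can be checked from plain Python, without Cython. The sketch below is only an illustration: it uses ctypes (not part of the original code) to call PyUnicode_FromWideChar() directly, it assumes Python 3, and the names from_wide and converted are made up for this example. It converts a wide-char buffer, then overwrites the buffer and checks that the converted string is unaffected.

```python
import ctypes

# Handle to the C API function; restype py_object lets ctypes own the result.
from_wide = ctypes.pythonapi["PyUnicode_FromWideChar"]
from_wide.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
from_wide.restype = ctypes.py_object

# A 100-element wchar_t buffer, standing in for the calloc'ed 'buf'.
buf = ctypes.create_unicode_buffer(u"hello", 100)

# Convert the buffer contents to a Python string.
converted = from_wide(buf, len(buf.value))

# Overwrite the source buffer after the conversion.
buf.value = u"HELLO"

# If the API copied the data, 'converted' still holds the old contents.
print(converted, buf.value)
```

If the string had shared memory with the buffer, the two printed values would match; a copy keeps them different.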

Any idea where I am going wrong? Is there a better way to convert a wide-char string to a Python unicode object?

bitflood · Accepted Answer

Solved!! Solution:

(Note: The solution refers to a piece of my code which was not in the question originally. I had no clue while posting that it would hold the key to solve this. Sorry to those who gave it a thought to solve ... )

In the Cython .pyx file, I had declared the Python API function like this:

PyObject* PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)

I checked out the docs at https://github.com/cython/cython/blob/master/Cython/Includes/cpython/__init__.pxd

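I had declared the return type as PyObject*, so Cython treated the result as a raw C pointer and did not take ownership of the new reference that PyUnicode_FromWideChar() returns. The cast <unicode>p then added a reference of its own, and the original one was never released because I was not decref-ing it explicitly. The solution was to change the return type in the declaration: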

object PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)

As per the docs, when a function is declared as returning 'object', Cython assumes it returns a new reference and takes ownership of it without incrementing the ref count again, so the memory is freed correctly in the for loop. The modified 'wc_to_pystr' looks like this:

cdef wc_to_pystr(wchar_t *buf):
    if buf == NULL:
        return None
    cdef size_t buflen
    buflen = wcslen(buf)
    p = PyUnicode_FromWideChar(buf, buflen)
    return p
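The same ownership difference can be observed from plain Python by comparing reference counts. The following is a rough analogy under stated assumptions, not the Cython code itself: it uses ctypes on Python 3, where a c_void_p return type plays the role of the 'PyObject*' declaration and a py_object return type plays the role of 'object'. The names raw_from_wide, owned_from_wide, py_decref and refcount are made up for this sketch.

```python
import ctypes
import sys

# Indexing pythonapi returns a fresh function object each time, so each
# handle can carry its own declared return type.
raw_from_wide = ctypes.pythonapi["PyUnicode_FromWideChar"]
raw_from_wide.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
raw_from_wide.restype = ctypes.c_void_p     # like 'PyObject*': a bare address

owned_from_wide = ctypes.pythonapi["PyUnicode_FromWideChar"]
owned_from_wide.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
owned_from_wide.restype = ctypes.py_object  # like 'object': result is owned

py_decref = ctypes.pythonapi["Py_DecRef"]
py_decref.argtypes = [ctypes.py_object]
py_decref.restype = None

def refcount(obj):
    # Same measuring overhead for every call, so only differences matter.
    return sys.getrefcount(obj)

text = u"some wide string"

# Owned return type: the new reference is handed straight to 'owned'.
owned = owned_from_wide(text, len(text))
owned_refs = refcount(owned)

# Raw return type: nobody owns the new reference behind 'addr', and turning
# the address into a Python object takes a second reference.
addr = raw_from_wide(text, len(text))
leaked = ctypes.cast(addr, ctypes.py_object).value
leaked_refs = refcount(leaked)

# Releasing the unowned reference explicitly restores the balance.
py_decref(leaked)
fixed_refs = refcount(leaked)

print(owned_refs, leaked_refs, fixed_refs)
```

Here leaked_refs comes out one higher than owned_refs, which is the reference that is never released in the loop. The explicit decref is the alternative to changing the declared return type: if the 'PyObject*' declaration is kept, the extra reference has to be released by hand.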
