
Debugging reference counting memory leaks in Python C extension modules

I'm trying to determine if there are any reference counting memory leaks in a Python C extension module. Consider this very simple test extension that leaks a date object:

#include <Python.h>
#include <datetime.h>

static PyObject* memleak(PyObject *self, PyObject *args) {
    PyDate_FromDate(2000, 1, 1); /* deliberately create a memory leak */
    Py_RETURN_NONE;
}

static PyMethodDef memleak_methods[] = {
    {"memleak",  memleak, METH_NOARGS, "Leak some memory"},
    {NULL, NULL, 0, NULL}        /* Sentinel */
};

PyMODINIT_FUNC initmemleak(void) {
    PyDateTime_IMPORT;
    Py_InitModule("memleak", memleak_methods);
}

PyDate_FromDate returns a new reference (the object starts with a reference count of 1, which the caller owns), and since I never call Py_DECREF on it, the object will never be deallocated.
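For readers less familiar with reference ownership, here is a rough Python-side illustration of the bookkeeping that Py_INCREF and Py_DECREF do in C. It is only a sketch: sys.getrefcount is a CPython-specific introspection helper, the variable names are made up for this example, and absolute counts vary between interpreter versions, so only the differences are meaningful.

```python
import sys

obj = object()
baseline = sys.getrefcount(obj)   # absolute value is version-dependent

holders = [obj]                   # a second owner appears: comparable to Py_INCREF
after_incref = sys.getrefcount(obj)

del holders[:]                    # that owner lets go: comparable to Py_DECREF
after_decref = sys.getrefcount(obj)

print(after_incref - baseline)    # one extra reference while the list held it
print(after_decref - baseline)    # back to the baseline once it was released
```

The date created in memleak() never gets the matching release, so its count stays at 1 and its memory is never freed.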

However, when I call this function, the number of objects being tracked by the garbage collector doesn't seem to change before and after the function call:

Python 2.7.3 (default, Apr 10 2013, 05:13:16)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from memleak import memleak
>>> import gc
>>> gc.disable()
>>> gc.collect()
0
>>> len(gc.get_objects()) # get object count before
3581
>>> memleak()
>>> gc.collect()
0
>>> len(gc.get_objects()) # get object count after
3581

And I can't seem to find the leaked date object at all in the list of objects returned by gc.get_objects():

>>> from datetime import date
>>> print [obj for obj in gc.get_objects() if isinstance(obj, date)]
[]

Am I missing something here about how gc.get_objects() works? Is there another way to demonstrate that the memleak() function has a memory leak?

Asked by del on Sep 09 '13 at 07:09




1 Answer

From the documentation of the gc module:

Since the collector supplements the reference counting already used in Python, you can disable the collector if you are sure your program does not create reference cycles.

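So the gc module deals only with reference cycles, and gc.get_objects() returns only the objects the collector tracks: container objects that could take part in a cycle. A date object cannot hold references to other objects, so it is never tracked and never shows up in that list, whether it was leaked or not.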
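A quick way to check this from the interpreter, as a minimal sketch assuming CPython with the standard C implementation of datetime (gc.is_tracked is available from Python 2.7 onwards; the variable names are only illustrative):

```python
import gc
from datetime import date

d = date(2000, 1, 1)    # an ordinary, correctly owned date object
container = [d]         # lists can hold references, so the collector tracks them

date_tracked = gc.is_tracked(d)
list_tracked = gc.is_tracked(container)
date_listed = any(o is d for o in gc.get_objects())

print(date_tracked)     # expected False on CPython: dates are not containers
print(list_tracked)     # expected True
print(date_listed)      # expected False: untracked objects are not returned
```

Even a date that is alive and reachable is missing from gc.get_objects(), so its absence says nothing about whether memleak() leaked one.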

In fact, early versions of Python had no cyclic garbage collector at all and relied solely on reference counting. The collector was introduced because reference cycles are easy to create from Python code, and pure-Python programs should not leak memory.

To observe this kind of leak, call the memleak function in a loop and watch the memory used by the process increase (slowly in your case, since each call leaks only one small date object).

There are also third-party libraries for profiling memory usage; see the Stack Overflow question "Which Python memory profiler is recommended?".

Answered by Bakuriu on Oct 26 '22 at 14:10