
Does the Python 3 interpreter leak memory when embedded?

This bug report states that the Python interpreter, as of June 2007, will not clean up all allocated memory after calling Py_Finalize in a C/C++ application with an embedded Python interpreter. It was recommended to call Py_Finalize once at application termination.

This bug report states that as of version 3.3 and March 2011 the interpreter still leaks memory.

Does anyone know the current state of this issue? I am concerned because I have an application in which the interpreter is called multiple times per running instance and I am experiencing memory leaks.

I am already using boost::python to handle reference counts and I clear the global dictionary of all references created by running a Python program in between runs. I have some singleton classes - might these be the problem?

Is this a tractable issue or is it a bug in the Python interpreter?

user1140116 asked Jan 10 '12 05:01



1 Answer

The first bug (the one from 2007) was closed as "wontfix" by nnorwitz; his explanation is in the bug report.

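Why do you call Py_Initialize/Py_Finalize more than once? Why not do something like this (moreFiles() and nextFile() are placeholders for however you iterate over your scripts):

#include <Python.h>
#include <stdio.h>

/* startup */
Py_Initialize();

/* do whatever */
while (moreFiles()) {
    const char *path = nextFile();
    FILE *fp = fopen(path, "r");
    if (fp != NULL) {
        /* execfile() no longer exists in Python 3, so run the file directly;
           the final 1 tells Python to close fp afterwards */
        PyRun_SimpleFileEx(fp, path, 1);
    }
    /* do whatever */
}

/* shutdown */
Py_Finalize();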
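
The same initialize-once, run-many idea can also be sketched on the Python side. This is only an illustration, not part of the original answer: it assumes you are free to drive the scripts from Python, and it uses the standard library's runpy.run_path, which executes each file in a fresh namespace so that nothing from one run lingers in a shared global dictionary. The helper name run_scripts and the two throwaway files are hypothetical.

```python
import runpy
import tempfile
from pathlib import Path


def run_scripts(paths):
    """Run each script in its own fresh namespace; return {path: status}."""
    results = {}
    for path in paths:
        try:
            # run_path builds a new globals dict for every call
            runpy.run_path(str(path))
            results[str(path)] = "ok"
        except Exception as exc:  # a failing script should not stop the loop
            results[str(path)] = f"failed: {exc!r}"
    return results


# Demo with two throwaway scripts (hypothetical names a.py and b.py).
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp, "a.py")
    a.write_text("leftover = 'set by a.py'\n")
    b = Path(tmp, "b.py")
    # b.py fails if a.py's global leaked into its namespace
    b.write_text("assert 'leftover' not in globals()\n")
    results = run_scripts([a, b])
```

Because each script gets its own namespace, there is no global dictionary to clear between runs; modules the scripts import still stay cached in sys.modules.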

The problem is that most people who write Python modules don't worry about what happens if their module gets finalized and reinitialized, and often don't care about cleaning up during finalization. Module authors know that all memory is released when the process exits, and don't bother with anything more than that.

So it's not really one bug, it's really a thousand bugs -- one for each extension module. It's an enormous amount of work for a bug that affects a minority of users, most of whom have a viable workaround.

You can always just omit the call to Py_Finalize; calling Py_Initialize a second time is a no-op. This means your application will use additional memory when you first run a Python script, and that memory won't be returned to the OS until you exit. As long as you're still running Python scripts every once in a while, I wouldn't categorize it as a leak. Your application might not be Valgrind-clean, but it's better than leaking like a sieve.

If you need to unload your (pure) Python modules to avoid leaking memory, you can do that. Just delete them from sys.modules.
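
A minimal sketch of that unloading step, under a few assumptions: the module is pure Python, the helper name unload is made up for this example, and scratch_mod is a throwaway module created just for the demo. Removing the entry from sys.modules only drops the interpreter's cached reference; the memory is reclaimed once nothing else still refers to the module or its objects.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def unload(name):
    """Drop a module and its submodules from sys.modules; return names removed."""
    doomed = [m for m in sys.modules if m == name or m.startswith(name + ".")]
    for m in doomed:
        del sys.modules[m]
    return doomed


# Demo with a throwaway module (hypothetical name: scratch_mod).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "scratch_mod.py").write_text("payload = list(range(1000))\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        first = importlib.import_module("scratch_mod")
        removed = unload("scratch_mod")
        still_cached = "scratch_mod" in sys.modules
        # importing again re-executes the file and builds a new module object
        second = importlib.import_module("scratch_mod")
        reloaded_fresh = second is not first
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("scratch_mod", None)
```

This does not help with extension modules written in C, which is the case the answer is warning about.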

Drawbacks of Py_Finalize: If you are executing Python scripts repeatedly, it doesn't make much sense to run Py_Finalize between them. You'll have to reload all the modules every time you reinitialize; my Python loads 28 modules at boot.
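
To see how many modules your own interpreter loads at startup, something like the following should work. It is a rough check, assuming sys.executable points at a usable interpreter (which may not hold inside an embedded host); the function name modules_at_startup is made up for this example, and the count varies by Python version and site configuration.

```python
import subprocess
import sys


def modules_at_startup():
    """Return the names in sys.modules of a freshly started interpreter."""
    child_code = "import sys; print('\\n'.join(sorted(sys.modules)))"
    out = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return out.split()


startup = modules_at_startup()
print(len(startup), "modules loaded before any user code runs")
```

Every one of those modules has to be imported again after each Py_Finalize/Py_Initialize cycle.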

Additional commentary: The bug is not limited to Python. A significant amount of the library code in any language will leak memory if you try to unload and reload libraries. Lots of libraries call into C code, and lots of C programmers assume that their libraries get loaded once and unloaded when the process exits.

Dietrich Epp answered Oct 03 '22 14:10
