
tracing memory leaks in Python (multiprocessing)

I have a multiprocessing application that leaks memory. However, the leak is not in the main process (according to Dowser and top) but in the subprocesses. Is there any way I can use Dowser (or a similar tool) on subprocesses to trace the leak? If not, how do I trace it?

UPDATE: I spent lots of time trying to use heapy and gnibbler's code, but I couldn't locate the leak. I then stopped CherryPy in the main process and started another instance (with Dowser) in the subprocess. But after a few minutes CherryPy would stop listening on the port... :( So I'm still looking for a better idea.

johndodo asked Oct 30 '12 07:10




3 Answers

I have found memory_profiler very easy to use, but I'm not sure how it interacts with multiprocessing, since I've never used that module. See this answer for a short explanation, and the other answers in that thread for mentions of other Python profilers.

Sir.Rainbow answered Oct 22 '22 03:10


I have hunted down the memory leak (which was in an external C library) by using muppy. Great tool, I wish I had found it sooner! Thanks all for the answers.

johndodo answered Oct 22 '22 03:10


I found a couple of posts that should prove quite helpful. I haven't had the time to digest all the information in them yet, but I thought I'd post the links so you can have a look at them as well.

Marius Gedminas has two posts on hunting memleaks in a Python test suite. He's using the built-in gc and inspect modules and simply dumping object graphs onto disk as CSV files, so the approach should work quite well even for multiprocessing applications.
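A minimal sketch of the "dump to CSV" idea, using only the standard library. This is not the code from the linked posts: it records per-type object counts rather than a full object graph, it does not use inspect, and the function name and column layout are made up for illustration. The point is that it can be called from inside a worker process, so the snapshot describes the subprocess rather than the parent.

```python
import csv
import gc
import os
from collections import Counter


def dump_type_counts(path):
    """Write one CSV row per type of object tracked by the garbage collector.

    Hypothetical helper: call it from inside the worker process, for example
    every N tasks, with a path that includes os.getpid().
    """
    gc.collect()  # drop unreachable cycles so only live objects are counted
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["pid", "type", "count"])
        for name, count in counts.most_common():
            writer.writerow([os.getpid(), name, count])
    return counts
```

Comparing two such files taken a few minutes apart from the same worker shows which types keep growing. Only objects tracked by the gc module are counted, so plain strings and numbers will not show up.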

I'll look into that myself later today when I get the time.

UPDATE

Marius released his test rig as an open source project called objgraph. It tracks the gc object references and lets you print out helpful information, such as how many instances of which type were added after a function call, and it lets you see complete reference chains for objects.

The docs are pretty self-explanatory, and I can't see a reason why it wouldn't work with multiprocessing applications just as well.
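To illustrate the "how many instances of which type were added after a function call" idea inside a subprocess, here is a rough standard-library approximation. It is not objgraph's API (see the objgraph docs for the real functions); Job, leaky_step and the module-level list are hypothetical stand-ins for whatever the real worker does.

```python
import gc
import multiprocessing as mp
from collections import Counter

_leaked = []  # hypothetical module-level list that keeps objects alive


class Job:
    """Hypothetical stand-in for whatever the real worker allocates."""


def leaky_step():
    # Simulated leak: every call parks 100 more Job instances in _leaked.
    _leaked.extend(Job() for _ in range(100))


def type_counts():
    gc.collect()
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth_after(func):
    """Return {type name: added instances} for types that grew while func ran."""
    before = type_counts()
    func()
    after = type_counts()
    return {name: after[name] - before[name]
            for name in after if after[name] > before[name]}


def worker(queue):
    # Runs inside the subprocess, so the counts describe the child's objects.
    queue.put(growth_after(leaky_step))


if __name__ == "__main__":
    q = mp.Queue()
    p = mp.Process(target=worker, args=(q,))
    p.start()
    print(q.get())  # the report includes an entry for Job
    p.join()
```

The measurement has to run in the child and be sent back (here through a Queue), because the parent process cannot see the child's objects.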

However, if your memory leak is coming from some underlying C library, then this might not help you. At least it should give you an idea of where the leak is. If it turns out not to be in your Python code, you might have to refactor your code so that you can run the relevant C libraries in the main process and use something like Valgrind to detect the leak.
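One rough way to tell the two cases apart, sketched under some assumptions: tracemalloc (standard library, Python 3.4+, not mentioned in the posts below) only sees memory allocated through Python's own allocators, while the resident set size reported by the OS covers the whole process. The resource module used here is Unix-only, and both function names are made up for this example.

```python
import resource
import tracemalloc


def memory_snapshot():
    """Return (bytes currently traced by tracemalloc, peak RSS from the OS)."""
    traced, _peak = tracemalloc.get_traced_memory()
    # ru_maxrss is the peak resident set size: kilobytes on Linux, bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return traced, peak_rss


def run_and_compare(func, rounds=5):
    """Call func repeatedly and record both measurements after each round."""
    tracemalloc.start()
    try:
        samples = []
        for _ in range(rounds):
            func()
            samples.append(memory_snapshot())
        return samples
    finally:
        tracemalloc.stop()
```

If the peak RSS keeps climbing over many rounds while the traced figure stays flat, the allocations are probably happening outside Python's allocator, for example inside a C extension. That is only a heuristic, but it is a cheap check to run inside the worker before reaching for Valgrind.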


The original post: http://mg.pov.lt/blog/hunting-python-memleaks.html

The one where he goes more into the tools he's using: http://mg.pov.lt/blog/python-object-graphs.html

The post that got me started: http://www.lshift.net/blog/2008/11/14/tracing-python-memory-leaks

Matti Lyra answered Oct 22 '22 02:10