Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python - Working around memory leaks

I have a Python program that runs a series of experiments, with no data intended to be stored from one test to another. My code contains a memory leak which I am completely unable to find (I've look at the other threads on memory leaks). Due to time constraints, I have had to give up on finding the leak, but if I were able to isolate each experiment, the program would probably run long enough to produce the results I need.

  • Would running each test in a separate thread help?
  • Are there any other methods of isolating the effects of a leak?

Detail on the specific situation

  • My code has two parts: an experiment runner and the actual experiment code.
  • Although no globals are shared between the code for running all the experiments and the code used by each experiment, some classes/functions are necessarily shared.
  • The experiment runner isn't just a simple for loop that can be easily put into a shell script. It first decides on the tests which need to be run given the configuration parameters, then runs the tests then outputs the data in a particular way.
  • I tried manually calling the garbage collector in case the issue was simply that garbage collection wasn't being run, but this did not work

Update

Gnibbler's answer has actually allowed me to find out that my ClosenessCalculation objects which store all of the data used during each calculation are not being killed off. I then used that to manually delete some links which seems to have fixed the memory issues.

like image 595
Casebash Avatar asked Oct 29 '09 01:10

Casebash


People also ask

Does valgrind work with Python?

Valgrind is used periodically by Python developers to try to ensure there are no memory leaks or invalid memory reads/writes. If you want to use Valgrind more effectively and catch even more memory leaks, you will need to configure python --without-pymalloc.

How do you fix memory issues in Python?

the MemoryError in Python. A memory error is raised when a Python script fills all the available memory in a computer system. One of the most obvious ways to fix this issue is to increase the machine's RAM .


1 Answers

You can use something like this to help track down memory leaks

>>> from collections import defaultdict >>> from gc import get_objects >>> before = defaultdict(int) >>> after = defaultdict(int) >>> for i in get_objects(): ...     before[type(i)] += 1  ...  

now suppose the tests leaks some memory

>>> leaked_things = [[x] for x in range(10)] >>> for i in get_objects(): ...     after[type(i)] += 1 ...  >>> print [(k, after[k] - before[k]) for k in after if after[k] - before[k]] [(<type 'list'>, 11)] 

11 because we have leaked one list containing 10 more lists

like image 57
John La Rooy Avatar answered Sep 24 '22 23:09

John La Rooy