
How to debug a MemoryError in Python? Tools for tracking memory use?

I have a Python program that dies with a MemoryError when I feed it a large file. Are there any tools that I could use to figure out what's using the memory?

This program ran fine on smaller input files. The program obviously needs some scalability improvements; I'm just trying to figure out where. "Benchmark before you optimize", as a wise person once said.

(Just to forestall the inevitable "add more RAM" answer: This is running on a 32-bit WinXP box with 4GB RAM, so Python has access to 2GB of usable memory. Adding more memory is not technically possible. Reinstalling my PC with 64-bit Windows is not practical.)

EDIT: Oops, this is a duplicate of "Which Python memory profiler is recommended?"

user9876 asked Nov 05 '09 16:11


4 Answers

The simplest and most lightweight way would likely be to use Python's built-in memory query capabilities, such as sys.getsizeof: just run it on your objects for a reduced problem (i.e. a smaller file) and see what takes a lot of memory.
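One caveat when trying this: sys.getsizeof reports only the object's own size, not the size of the objects it refers to, so a list of a million strings looks tiny. Below is a minimal sketch of a recursive wrapper, assuming Python 3; the helper name total_size and the sample data are made up for illustration, and it only descends into the built-in containers, so treat the numbers as rough estimates.

```python
import sys

def total_size(obj, seen=None):
    """Rough size in bytes of obj plus the objects it contains.

    Hypothetical helper: only descends into dict, list, tuple, set and
    frozenset, and counts each distinct object once.
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0  # already counted (shared reference or cycle)
    seen.add(id(obj))

    size = sys.getsizeof(obj)  # shallow size of this object only
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += total_size(key, seen) + total_size(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += total_size(item, seen)
    return size

# Sample data standing in for whatever the program builds from a small file.
records = [str(i) * 10 for i in range(1000)]
shallow = sys.getsizeof(records)  # the list object alone
deep = total_size(records)        # the list plus its strings
print("shallow:", shallow, "deep:", deep)
```

Running this on the main data structures for a small input, then a slightly larger one, should show which structure grows fastest.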

Eli Bendersky answered Nov 03 '22 04:11


Heapy is a memory profiler for Python, which is the type of tool you need.

Wim answered Nov 03 '22 05:11


In your case, the answer is probably very simple: do not read the whole file at once, but process it chunk by chunk. That may be very easy or complicated depending on your usage scenario. For example, an MD5 checksum of a huge file can be computed with far less memory by feeding the file to the hash in pieces instead of reading it all in. That change dramatically reduced memory consumption in some SCons usage scenarios, but the problem was almost impossible to trace with a memory profiler.
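To make the MD5 example concrete, here is a minimal sketch of chunked hashing, assuming Python 3 and the standard hashlib module. The function name md5_of_file and the 1 MiB chunk size are arbitrary choices for illustration, not anything taken from SCons.

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks.

    Memory use stays around chunk_size bytes regardless of file size.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # f.read() returns b"" at end of file, which stops the iteration.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The same pattern (open the file, loop over pieces, keep only a small running state) applies to most line-oriented or record-oriented processing; for text files, iterating over the file object line by line is usually enough.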

If you still need a memory profiler: eliben already suggested sys.getsizeof. If that doesn't cut it, try Heapy or Pympler.

Pankrat answered Nov 03 '22 05:11


You asked for a tool recommendation:

Python Memory Validator allows you to monitor the memory usage, allocation locations, GC collections, object instances, memory snapshots, etc. of your Python application. Windows only.

http://www.softwareverify.com/python/memory/index.html

Disclaimer: I was involved in the creation of this software.

Stephen Kellett answered Nov 03 '22 04:11