Python memory usage: Which of my objects is hogging the most memory?

The program I've written stores a large amount of data in dictionaries. Specifically, I'm creating 1588 instances of a class, each of which contains 15 dictionaries with 1500 float to float mappings. This process has been using up the 2GB of memory on my laptop pretty quickly (I start writing to swap at about the 1000th instance of the class).

My question is, which of the following is using up my memory?

  • The 34-million-some pairs of floats?
  • The overhead of the 22,500 dictionaries?
  • The overhead of the 1500 class instances?

To me it seems like the memory hog should be the huge number of floating point numbers that I'm holding in memory. However, if what I've read so far is correct, each of my floating point numbers takes up 16 bytes. Since I have 34 million pairs, this should be about 108 million bytes, which should be just over a gigabyte.
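That estimate is easy to sanity-check in the interpreter (a quick sketch; per-float sizes are build-dependent -- 16 bytes on the 32-bit CPython builds of that era, 24 bytes on a modern 64-bit interpreter):

```python
import sys

# Size of one Python float object (build-dependent: 16 bytes on 32-bit
# CPython, 24 bytes on modern 64-bit CPython).
print(sys.getsizeof(1.0))

# Lower bound for 34 million float->float pairs at 16 bytes per float,
# counting only the float objects themselves (no dict overhead):
pairs = 34_000_000
print(pairs * 2 * 16)  # 1_088_000_000 bytes, i.e. just over a gigabyte
```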

Is there something I'm not taking into consideration here?

asked Jun 21 '10 by Wilduck




1 Answer

The floats do take up 16 bytes apiece, and a dict with 1500 entries about 100k:

>>> import sys
>>> sys.getsizeof(1.0)
16
>>> d = dict.fromkeys((float(i) for i in range(1500)), 2.0)
>>> sys.getsizeof(d)
98444

so the 22,500 dicts take over 2GB all by themselves, and the 68 million floats another GB or so. I'm not sure how you computed that 68 million times 16 equals only 100M -- you may have dropped a zero somewhere.
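The arithmetic above can be sketched directly (a check of the figures quoted in the answer, using the 2010-era per-object sizes; modern interpreters report different numbers):

```python
# The answer's arithmetic, using the 2010-era sizes measured above.
n_dicts = 22_500
bytes_per_dict = 98_444          # getsizeof of a 1500-entry float dict, above
print(n_dicts * bytes_per_dict)  # 2_214_990_000 bytes, a bit over 2 GB

n_floats = 68_000_000            # 34 million pairs -> 68 million floats
print(n_floats * 16)             # 1_088_000_000 bytes, another GB or so
```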

The class itself takes up a negligible amount, and 1500 instances thereof (net of the objects they refer to, of course, just as getsizeof gives us such net amounts for the dicts) not much more than a smallish dict each -- so that's hardly the problem. For example, with an empty class Sic:

>>> class Sic(object): pass
...
>>> sys.getsizeof(Sic)
452
>>> sys.getsizeof(Sic())
32
>>> sys.getsizeof(Sic().__dict__)
524

452 bytes for the class, and (524 + 32) * 1550 = 862K for all the instances -- as you can see, that's not the worry when you have gigabytes apiece in dicts and floats.
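The instance measurement is easy to reproduce; a trivial empty class stands in for the one in the answer (exact sizes are build-dependent, so the numbers will differ from those above on a modern interpreter):

```python
import sys

# A trivial empty class standing in for the one measured above.
# getsizeof counts only the instance and its __dict__, not the
# objects they refer to.
class Sic(object):
    pass

inst = Sic()
per_instance = sys.getsizeof(inst) + sys.getsizeof(inst.__dict__)
total = per_instance * 1550      # all 1550 instances, net of their contents
print(per_instance, total)       # a few hundred KB at most -- negligible
```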

answered Oct 07 '22 by Alex Martelli