
Python: Memory leak?

Query in Python interpreter:

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> k = [i for i in xrange(9999999)]
>>> import sys
>>> sys.getsizeof(k)/1024/1024
38
>>>

And here is how much RAM it actually takes:


Memory usage after the statement del k:

And after gc.collect():

Why does a list of integers with an expected size of 38 MB take 160 MB?

UPD: This part of the question was answered (almost immediately and multiple times :))

Okay - here is another riddle:

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys

>>> str = 'abcdefg'
>>> sys.getsizeof(str)
28
>>> k = []
>>> for i in xrange(9999999):
...     k.append(str)
...
>>> sys.getsizeof(str)*9999999/1024/1024
267

How much do you think it will consume now?



The size of str is 28 bytes, versus 12 bytes for an integer in the previous example. So the expected memory usage is 267 MB, even more than with integers. But it takes only ~40 MB!

Gill Bates asked Dec 14 '12 16:12





2 Answers

sys.getsizeof() is not very useful here because it often accounts for only part of what you expect. In this case, it accounts for the list itself, but not for the integer objects that the list refers to. The list takes roughly 4 bytes per item (one pointer each on this 32-bit build). The integer objects take another 12 bytes each. For example, if you try this:

import sys
k = [42] * 9999999  # every slot points to the same integer object
print sys.getsizeof(k)

you'll see that the list still takes 4 bytes per item, i.e. around 40MB, but because all items are pointers to the same integer object 42, the total memory usage is not much more than 40MB.
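
The shared-reference effect described above can be checked directly by counting how many different objects a list refers to. This is only a minimal sketch, assuming CPython 3 (the question used a 32-bit Python 2.7 build); the helper name count_distinct_objects is hypothetical and not part of the original answer.

```python
def count_distinct_objects(items):
    """Count how many different objects a list refers to, by identity."""
    return len({id(obj) for obj in items})


n = 100000  # smaller than the question's 9999999, to keep the check quick

# One string object referenced n times, as in the second riddle.
shared = ["abcdefg"] * n

# n separate integer objects (values above CPython's small-int cache).
distinct = list(range(1000, 1000 + n))

print(count_distinct_objects(shared))
print(count_distinct_objects(distinct))
```

On CPython the first count should be 1 and the second should equal n, which is why a list of repeated references costs little more than its own pointer array.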

Armin Rigo answered Oct 15 '22 17:10



What is getsizeof()

First, I suggest looking at what sys.getsizeof() actually measures. You can find the exact description in the documentation. I want to zoom in on the following sentence:

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

This means that when you call sys.getsizeof() on a list, you don't get the total size of the list and its contents. You only get the memory that is dedicated to managing the list itself. The list still refers to 9999999 integers. Each integer takes 12 bytes, which comes to about 114 MB in total. Adding the roughly 38 MB used to manage the list (the figure measured in the question) gives about 152 MB, which comes a lot closer to your result.
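
To make the documentation sentence concrete, here is a rough sketch that compares the shallow size of a flat list with the size of the list plus the objects it refers to. It assumes CPython 3, so the byte counts will not match the 32-bit Python 2.7 figures above, and the helper name shallow_and_total_size is hypothetical. It handles only flat lists and does not follow nested containers.

```python
import sys


def shallow_and_total_size(items):
    """Return (shallow, total) sizes in bytes for a flat list.

    shallow: sys.getsizeof() of the list object alone.
    total:   shallow plus the size of each distinct element, counted once.
    """
    shallow = sys.getsizeof(items)
    total = shallow
    seen = set()
    for obj in items:
        if id(obj) not in seen:  # count a shared object only once
            seen.add(id(obj))
            total += sys.getsizeof(obj)
    return shallow, total


n = 100000  # smaller than the question's 9999999, to keep the check quick
distinct = list(range(1000, 1000 + n))  # n separate integer objects
shared = ["abcdefg"] * n                # one string referenced n times

d_shallow, d_total = shallow_and_total_size(distinct)
s_shallow, s_total = shallow_and_total_size(shared)

print("distinct ints:", d_shallow, d_total)
print("shared string:", s_shallow, s_total)
```

For the list of distinct integers the total should be several times the shallow size, while for the list of repeated strings the two numbers should differ only by the size of a single string.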

Erik answered Oct 15 '22 16:10
