Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> k = [i for i in xrange(9999999)]
>>> import sys
>>> sys.getsizeof(k)/1024/1024
38
>>>
After that I ran del k and then gc.collect().
Why does a list of integers with an expected size of 38 MB take 160 MB?
UPD: This part of the question was answered (almost immediately and multiple times :)).
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> str = 'abcdefg'
>>> sys.getsizeof(str)
28
>>> k = []
>>> for i in xrange(9999999):
...     k.append(str)
...
>>> sys.getsizeof(str)*9999999/1024/1024
267
The size of str is 28 bytes, versus 12 in the previous example. So the expected memory usage is 267 MB, even more than with integers. But it takes only ~40 MB!
Using debugging to track down memory leaks: you can debug memory usage in Python with the built-in gc (garbage collector) module. It can give you a list of the objects known to the garbage collector, which lets you see where most of your program's memory is going.
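As a rough illustration of that approach, the sketch below counts the objects returned by gc.get_objects() by type. It is written for Python 3 (an assumption; the transcripts above use Python 2.7), and count_tracked_objects is a hypothetical helper name, not part of the gc module. Note that the collector only tracks container objects, so plain integers and strings do not show up in this list.

```python
import gc
from collections import Counter


def count_tracked_objects(top=5):
    """Return the `top` most common type names among gc-tracked objects."""
    # gc.get_objects() lists objects tracked by the collector (containers
    # such as lists, dicts, and instances), not atomic ints or strs.
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(top)


for type_name, count in count_tracked_objects():
    print(type_name, count)
```

Running this at two points in a program and comparing the counts is one way to spot a type whose population keeps growing.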
To find a memory leak, look at how much RAM the system is using. On Windows, the Resource Monitor can be used for this. In Windows 8.1 and Windows 10, press Windows+R to open the Run dialog, type "resmon", and click OK.
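The operating system view can be complemented from inside the interpreter. The sketch below uses the standard tracemalloc module, which is an assumption on my part: it exists only in Python 3.4 and later, so it would not run on the Python 2.7 build shown in the question. It also uses a smaller list than the question to keep the run short.

```python
import tracemalloc

tracemalloc.start()

# Smaller than the 9999999 items in the question, to keep the run short.
k = [i for i in range(1000000)]

# get_traced_memory() returns (current, peak) in bytes for traced allocations.
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("current: %.1f MiB, peak: %.1f MiB" % (current / 2.0**20, peak / 2.0**20))
```

Unlike sys.getsizeof(k), this figure includes the integer objects the list refers to, because every allocation made while tracing is counted.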
Those numbers can easily fit in a 64-bit integer, so one would hope Python would store those million integers in no more than ~8MB: a million 8-byte objects. In fact, Python uses more like 35MB of RAM to store these numbers. Why? Because Python integers are objects, and objects have a lot of memory overhead.
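A minimal sketch of that overhead, assuming Python 3 on CPython (exact byte counts vary by version and platform, so none are claimed here): it adds the list's own size to the size of every integer object it refers to and compares the total with the ideal 8 bytes per value.

```python
import sys

n = 1000000
numbers = list(range(n))

list_bytes = sys.getsizeof(numbers)                 # the list's own pointer array
int_bytes = sum(sys.getsizeof(x) for x in numbers)  # every int object it refers to
total_bytes = list_bytes + int_bytes

print("raw 8-byte values:  %.1f MB" % (8 * n / 1e6))
print("list + int objects: %.1f MB" % (total_bytes / 1e6))
```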
sys.getsizeof() is not very useful here, because it often accounts for only part of what you expect. In this case, it accounts for the list itself, but not for the integer objects that the list refers to. The list takes roughly 4 bytes per item (one pointer on a 32-bit build). The integer objects take another 12 bytes each. For example, if you try this:
import sys
k = [42] * 9999999
print(sys.getsizeof(k))
you'll see that the list still takes 4 bytes per item, i.e. around 40MB, but because all items are pointers to the same integer object 42, the total memory usage is not much more than 40MB.
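The difference between shared and distinct items can be sketched as follows. deep_list_size is a hypothetical helper (not a standard library function) that adds the size of each distinct object in a flat list exactly once; it does not recurse into nested containers.

```python
import sys


def deep_list_size(lst):
    """Size of the list plus each distinct object it refers to, counted once."""
    seen = {}
    for item in lst:
        seen[id(item)] = item  # items with the same id are the same object
    return sys.getsizeof(lst) + sum(sys.getsizeof(obj) for obj in seen.values())


n = 100000
shared = [42] * n                        # n pointers to one int object
distinct = [i + 1000 for i in range(n)]  # n pointers to n different int objects

print(deep_list_size(shared), deep_list_size(distinct))
```

The same reasoning applies to the list built by appending one string 9999999 times: every slot points at the same string object, so the string's 28 bytes are paid only once.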
First, let's look at what sys.getsizeof actually measures. You can find the exact description in the documentation. I want to zoom in on the following sentence.
Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.
This means that when you call sys.getsizeof(k), you don't get the full size of the list and its contents. You only get the memory dedicated to managing the list itself. The list still refers to 9999999 integer objects. Each integer takes 12 bytes, which adds up to about 114 MB. The memory for the list itself (38 MB, as measured in the question) plus the memory for the integers comes to roughly 152 MB, which is a lot closer to your result.
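The estimate can be reproduced as plain arithmetic. This sketch assumes the per-item figures quoted in the answers for the 32-bit Python 2.7 build: 4 bytes per list slot and 12 bytes per integer object. With those figures the list's own slots come to about 38 MB, which matches the sys.getsizeof result in the question.

```python
n = 9999999
pointer_bytes = 4  # per list slot on a 32-bit build (figure taken from the text)
int_bytes = 12     # per int object on 32-bit Python 2.7 (figure taken from the text)

list_mb = n * pointer_bytes / 1024.0 / 1024.0
ints_mb = n * int_bytes / 1024.0 / 1024.0
total_mb = list_mb + ints_mb

print("list: %.1f MB, ints: %.1f MB, total: %.1f MB" % (list_mb, ints_mb, total_mb))
```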