 

Why is the memory usage of a Python list smaller than expected?


As the session below shows, a list of 50,000,000 records takes only about 404 MB of memory. Why? Since one record takes 83 bytes, I expected 50,000,000 records to take about 3957 MB.

>>> import sys
>>> a=[]
>>> for it in range(5*10**7):a.append("miJ8ZNFG9iFqiQQohvyTWwqsij2rJCiZ7v"+str(it))
... 
>>> print(sys.getsizeof(a)/1024**2)
404.4306411743164
>>> print(sys.getsizeof("miJ8ZNFG9iFqiQQohvyTWwqsij2rJCiZ7v"))
83
>>> print(83*5*10**7/1024**2)
3957.7484130859375
>>> 
purplecity asked Jan 17 '19 03:01




1 Answer

sys.getsizeof only reports the cost of the list itself, not its contents. So you're seeing the cost of storing the list object header plus (a little over) 50M pointers. You're likely on a system with 64-bit (eight-byte) pointers, so storage for 50M pointers is ~400 MB. Getting the true size would require calling sys.getsizeof on each object, each object's __dict__ (if applicable), etc., recursively, and even then it won't be 100% accurate, since some of the objects (e.g. small ints) are likely shared. This is not a rabbit hole you want to go down.
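
To make the difference concrete, here is a minimal sketch that compares the list-only size with a rough total that also counts each distinct string once. The helper name shallow_and_total_size is hypothetical (it is not part of the standard library), the sketch only goes one level deep, and it uses a smaller list than the question so it runs quickly. Exact numbers depend on the Python build, so none are shown.

```python
import sys

def shallow_and_total_size(items):
    """Return (size of the list alone, list size plus each distinct element).

    Hypothetical helper for illustration: it goes only one level deep and
    does not follow references held by the elements themselves.
    """
    list_only = sys.getsizeof(items)
    seen = set()          # ids of objects already counted
    elements = 0
    for obj in items:
        if id(obj) not in seen:   # count a shared object only once
            seen.add(id(obj))
            elements += sys.getsizeof(obj)
    return list_only, list_only + elements

prefix = "miJ8ZNFG9iFqiQQohvyTWwqsij2rJCiZ7v"
# Smaller than the 5*10**7 used in the question, to keep the run short.
a = [prefix + str(i) for i in range(10**5)]

list_only, total = shallow_and_total_size(a)
print("list only, bytes per item:", list_only / len(a))
print("list + strings, bytes per item:", total / len(a))
```

On a typical 64-bit CPython build, the first figure should come out close to the pointer size, while the second should exceed the 83 bytes measured in the question, because each stored string is the prefix plus extra digits.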

ShadowRanger answered Oct 31 '22 18:10