
sys.getsizeof(list) returns less than the sum of its elements

Tags: python, list

I'm curious - why does the sys.getsizeof call return a smaller number for a list than the sum of its elements?

import sys
lst = ["abcde", "fghij", "klmno", "pqrst", "uvwxy"]
print("Element sizes:", [sys.getsizeof(el) for el in lst])
print("Sum of sizes: ", sum([sys.getsizeof(el) for el in lst]))
print("Size of list: ", sys.getsizeof(lst))

The above prints

Element sizes: [42, 42, 42, 42, 42]
Sum of sizes:  210
Size of list:  112

How come?

EvenLisle asked May 05 '15 08:05




2 Answers

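You are getting the size of the list object itself. The list object stores only pointers to its elements, not the elements themselves, so its size is independent of how large those elements are. For a list of strings like yours, that makes it smaller than the sum of the element sizes.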

By analogy, it’s like getting the size of an array of pointers in C.
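A small sketch of that point, assuming CPython (the variable names are only illustrative): two lists with the same number of elements should report the same size, no matter how large the strings they point to are.

```python
import sys

# Assumes CPython; sys.getsizeof is implementation specific.
# Both lists are built the same way and hold five references each.
short_items = ["a" * 5 for _ in range(5)]
long_items = ["a" * 5000 for _ in range(5)]

# The list sizes match, because only the pointers are counted.
print("List of short strings:", sys.getsizeof(short_items))
print("List of long strings: ", sys.getsizeof(long_items))

# The element sizes differ widely.
print("Sum of short elements:", sum(sys.getsizeof(s) for s in short_items))
print("Sum of long elements: ", sum(sys.getsizeof(s) for s in long_items))
```

The exact byte counts depend on the Python version and platform, so none are shown here.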

Tasos Vogiatzoglou answered Oct 09 '22 15:10


As per the documentation, sys.getsizeof does the following:

Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

So sys.getsizeof only gives you the full memory footprint for simple objects that hold no references to other objects. For built-in container types (list, dict, etc.), you usually need some sort of recursive function to find the "total" size of the container. Keep in mind, though, that a Python list is really just a resizable array of pointers, so in a sense, the number it reports is accurate.

However, you are looking for something like this:

https://code.activestate.com/recipes/577504/
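For illustration only, here is a much simplified sketch of the recursive idea. It is not the code from that recipe: the function name total_size and the set of container types it handles are assumptions made for this example, and it assumes CPython.

```python
import sys

def total_size(obj, seen=None):
    """Approximate size of obj plus everything it refers to (hypothetical helper)."""
    if seen is None:
        seen = set()
    # Count each object once, even if it is referenced several times.
    if id(obj) in seen:
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(item, seen) for item in obj)
    return size

lst = ["abcde", "fghij", "klmno", "pqrst", "uvwxy"]
print("Size of list object:", sys.getsizeof(lst))
print("Total size:         ", total_size(lst))
```

This sketch only descends into the built-in containers listed in the isinstance checks; objects of other types, including class instances with attributes, are counted with sys.getsizeof alone.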

Also, note that:

>>> sys.getsizeof(npArrayList[0])
96

Every NumPy array, like any Python object, carries some per-object overhead, and sys.getsizeof includes it (npArrayList is not defined in this thread; it is presumably a list of NumPy arrays). The nbytes attribute, by contrast, only takes into account the memory of the array contents, and not the overhead of the whole object:

>>> npArrayList[0].nbytes
32

juanpa.arrivillaga answered Oct 09 '22 15:10