
Why does Python seem to allocate more memory than sys.getsizeof accounts for?

Tags:

python

Example:

import sys

class Test():
    def __init__(self):
        self.a = 'a'
        self.b = 'b'
        self.c = 'c'
        self.d = 'd'
        self.e = 'e'

if __name__ == '__main__':
    test = [Test() for i in range(100000)]
    print(sys.getsizeof(test))

In Windows Task Manager, I see a jump of ~20 MB when creating a list of 100000 vs 10.

Using sys.getsizeof(): for a list of 100000, I get 412,236 bytes; for a list of 10, I get 100 bytes.

This seems hugely disproportionate. Why is this happening?

Jeff asked Oct 22 '22 14:10


2 Answers

The memory allocated is not disproportionate; you are creating 100,000 objects! As you can see, they take up roughly 34 megabytes of space:

>>> sys.getsizeof(Test())+sys.getsizeof(Test().__dict__)
344
>>> (sys.getsizeof(Test())+sys.getsizeof(Test().__dict__)) * 100000 / 10**6
34.4  # megabytes

You can get a minor improvement with __slots__, but by this estimate you will still need about 20 MB of memory to store those 100,000 objects.

>>> sys.getsizeof(Test2())+sys.getsizeof(Test2().__slots__)
200
>>> (sys.getsizeof(Test2())+sys.getsizeof(Test2().__slots__)) * 100000 / 10**6
20.0  # megabytes

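(With credit to mensi's answer: sys.getsizeof does not follow references. It reports the size of the object itself, not of the objects it refers to, so for a list it counts the array of pointers but not the elements. You can use autocomplete to see most of the attributes of an object.)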
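
To make the point about references concrete, here is a minimal sketch that adds up sys.getsizeof over a list and everything reachable from it. The helper name deep_getsizeof is hypothetical (it is not part of the standard library), it only walks common containers and instance __dict__s, and the totals vary by Python version and platform, so treat the result as a rough estimate rather than what Task Manager will show.

```python
import sys


class Test:
    def __init__(self):
        self.a = 'a'
        self.b = 'b'
        self.c = 'c'
        self.d = 'd'
        self.e = 'e'


def deep_getsizeof(obj, seen=None):
    """Hypothetical helper: sum sys.getsizeof over obj and what it references."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0  # already counted, e.g. the shared strings 'a'..'e'
    seen.add(id(obj))

    size = sys.getsizeof(obj)  # the object itself, references not followed
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen) + deep_getsizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_getsizeof(item, seen)
    elif hasattr(obj, '__dict__'):
        # Plain instances keep their attributes in a separate dict.
        size += deep_getsizeof(vars(obj), seen)
    return size


if __name__ == '__main__':
    tests = [Test() for _ in range(100000)]
    print('list only:', sys.getsizeof(tests))
    print('list plus contents:', deep_getsizeof(tests))
```

The first number covers only the list's own pointer array; the second also includes each instance and its attribute dict, which is where most of the memory goes.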

See SO answer: Usage of __slots__? http://docs.python.org/release/2.5.2/ref/slots.html

To use __slots__:

class Test2():
    __slots__ = ['a','b','c','d','e']

    def __init__(self):
        ...
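
One way to check how much __slots__ actually saves is to measure allocations directly with the standard tracemalloc module instead of adding up sys.getsizeof values. Note that __slots__ itself is a class attribute shared by every instance, so its size is not paid per object. The sketch below is an assumption-laden illustration: the class names WithDict and WithSlots and the helper allocated_bytes are made up for this example, and the figures depend on the Python version and build.

```python
import tracemalloc


class WithDict:
    def __init__(self):
        self.a = 'a'
        self.b = 'b'
        self.c = 'c'
        self.d = 'd'
        self.e = 'e'


class WithSlots:
    __slots__ = ('a', 'b', 'c', 'd', 'e')

    def __init__(self):
        self.a = 'a'
        self.b = 'b'
        self.c = 'c'
        self.d = 'd'
        self.e = 'e'


def allocated_bytes(cls, count=100000):
    """Bytes allocated by Python while building a list of `count` instances."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    instances = [cls() for _ in range(count)]
    # `instances` is still alive here, so its memory is included in `after`.
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


if __name__ == '__main__':
    print('with __dict__: ', allocated_bytes(WithDict) / 10**6, 'MB')
    print('with __slots__:', allocated_bytes(WithSlots) / 10**6, 'MB')
```

tracemalloc only sees memory allocated through Python's allocators, so the numbers will not match Task Manager exactly, but the relative difference between the two classes should be visible.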
ninjagecko answered Oct 26 '22 23:10


Every instance references a dict for its __dict__, which is 272 bytes on my machine for your example. Multiply that by 100,000 and you get about 27 MB.
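
As a quick check of this claim, the sketch below confirms that each instance has its own __dict__ object and sums their sizes. The helper name dict_overhead is hypothetical, and the per-dict size will likely differ from the 272 bytes quoted above depending on the Python version and platform.

```python
import sys


class Test:
    def __init__(self):
        self.a = 'a'
        self.b = 'b'
        self.c = 'c'
        self.d = 'd'
        self.e = 'e'


def dict_overhead(instances):
    """Return (number of distinct __dict__ objects, their combined size in bytes)."""
    # Hold references to the dicts so their ids stay valid while we count them.
    dicts = [obj.__dict__ for obj in instances]
    distinct = len({id(d) for d in dicts})
    total = sum(sys.getsizeof(d) for d in dicts)
    return distinct, total


if __name__ == '__main__':
    tests = [Test() for _ in range(100000)]
    distinct, total = dict_overhead(tests)
    print('distinct dicts:', distinct)
    print('dict bytes:', total, 'vs list bytes:', sys.getsizeof(tests))
```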

mensi answered Oct 26 '22 22:10