
Python: are objects more memory-hungry than dictionaries?

Tags: python

Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
# RAM usage: 2100
>>> class Test:
...     def __init__(self, i):
...         self.one = i
...         self.hundred = 100*i
...
# RAM usage: 2108
>>> list1 = [ Test(i) for i in xrange(10000) ]
# RAM usage: 4364
>>> del(list1)
# RAM usage: 2780
>>> list2 = [ {"one": i, "hundred": 100*i} for i in xrange(10000) ]
# RAM usage: 3960
>>> del(list2)
# RAM usage: 2908

Why does a list of objects take twice as much memory as a list of equivalent dictionaries? I thought an object would be much more efficient since there is no need to store copies of attribute names for each object.

Nikolai asked Dec 21 '22 09:12


2 Answers

If you define a class in Python (as opposed to writing it as a C extension), then by default each instance uses a dictionary to store all of its attributes. This is why an instance cannot be smaller than a dictionary holding the same data: it carries that dictionary plus its own object overhead. It is also why you can assign arbitrary attributes to most Python objects.
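As a small illustration of the point about arbitrary attributes, here is a sketch that reuses the Test class from the question. It is written for Python 3 (an assumption; the original session is Python 2.7), and the attribute name thousand is made up for the example.

```python
class Test(object):
    def __init__(self, i):
        self.one = i
        self.hundred = 100 * i

t = Test(5)

# An attribute that __init__ never mentioned can still be assigned,
# because it simply becomes a new key in the instance's dictionary.
t.thousand = 1000 * t.one  # "thousand" is a hypothetical extra attribute

# vars(t) returns the instance's __dict__.
print(sorted(vars(t).items()))
# prints [('hundred', 500), ('one', 5), ('thousand', 5000)]
```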

If you know in advance which attributes your object will require, you can specify them with the __slots__ attribute on your class. This allows Python to be more efficient and not require an entire dictionary for each object. In your case, you could do this by adding

__slots__ = ["one", "hundred"]

on the line below class Test:. Note that in Python 2, which the question uses, __slots__ only takes effect on new-style classes, so the class line itself would also need to read class Test(object):. I expected this would not be enough to make the objects smaller than the dictionaries, since Python's dictionaries are highly optimized for a small number of values. (Edit: to my surprise, it apparently does make them smaller than the dictionaries.)
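For reference, a minimal sketch of what the slotted class could look like. The name SlottedTest is hypothetical, and the code is written to run on Python 3 (an assumption; the explicit object base class is what would make it work on Python 2 as well).

```python
class SlottedTest(object):
    # Declaring __slots__ reserves fixed storage for exactly these two
    # attributes instead of creating a per-instance __dict__.
    __slots__ = ["one", "hundred"]

    def __init__(self, i):
        self.one = i
        self.hundred = 100 * i

obj = SlottedTest(3)

# The instance no longer has a dictionary of its own.
has_dict = hasattr(obj, "__dict__")

# The trade-off: attributes not listed in __slots__ cannot be assigned.
try:
    obj.thousand = 3000
    extra_allowed = True
except AttributeError:
    extra_allowed = False

print(has_dict, extra_allowed)
# prints False False
```

The price of the saved memory is flexibility: a slotted instance accepts only the attributes named in __slots__.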

Jeremy answered Dec 24 '22 03:12


Python implements attribute lookup on instances using dictionaries: when you ask for someObject.x and x is stored on the instance, what you get is, roughly, someObject.__dict__["x"]. (And yes, you can type that in; the underlying dictionary is accessible through the __dict__ attribute.)
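The following sketch shows the instance-dictionary case only; the full lookup rules also consult the class, which the example hints at with a method. It assumes Python 3, and the method name total is invented for illustration.

```python
class Test(object):
    def __init__(self, i):
        self.one = i
        self.hundred = 100 * i

    def total(self):  # hypothetical method, added only for the demo
        return self.one + self.hundred

some_object = Test(2)

# Instance attributes are found in the per-instance dictionary.
same_value = some_object.hundred == some_object.__dict__["hundred"]

# Methods are not stored per instance; they live in the class's dictionary.
method_in_instance = "total" in some_object.__dict__
method_in_class = "total" in Test.__dict__

print(same_value, method_in_instance, method_in_class)
# prints True False True
```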

So, first, the attribute names actually are stored once per object instance, as keys of that dictionary (remember, Python doesn't know for sure that every object of a class has the same attributes with the same names!). Second, in addition to that dictionary, an object carries a bit of extra data that a plain dictionary doesn't need, such as a pointer to its class, which is where its methods live.
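One rough way to see both costs is sys.getsizeof, sketched below for Python 3 (an assumption). It reports only the object itself and not what it refers to, so the instance and its dictionary are measured separately. The numbers depend on the interpreter version and platform, and newer CPython releases store instance dictionaries more compactly than 2.7 did, so the gap may not match the figures in the question.

```python
import sys

class Test(object):
    def __init__(self, i):
        self.one = i
        self.hundred = 100 * i

obj = Test(1)
plain = {"one": 1, "hundred": 100}

# Shallow sizes in bytes: the instance, its attribute dictionary,
# and an equivalent plain dictionary.
instance_size = sys.getsizeof(obj)
instance_dict_size = sys.getsizeof(obj.__dict__)
plain_dict_size = sys.getsizeof(plain)

print("instance:         ", instance_size)
print("instance __dict__:", instance_dict_size)
print("instance total:   ", instance_size + instance_dict_size)
print("plain dict:       ", plain_dict_size)

# The "bit of extra data" includes the reference back to the class.
print(obj.__class__ is Test)
```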

azernik answered Dec 24 '22 03:12