I have a moderate number of base objects.
These base objects will be put in collections, and these collections will be munged around: sorted, truncated, etc.
Unfortunately, n is large enough that memory consumption is slightly worrisome, and speed is becoming a concern.
My understanding is that tuples are slightly more memory-efficient, since they are deduplicated.
Anyway, I would like to know what the cpu/memory tradeoffs of lists vs. tuples are in Python 2.6/2.7.
Internally, tuples are stored a little more efficiently than lists, and also tuples can be accessed slightly faster.
Tuples are stored in a single block of memory. Because tuples are immutable, they need no spare capacity for elements added later. Lists are allocated in two blocks: a fixed-size one holding the Python object information and a variable-sized one holding the data. This is also why creating a tuple is faster than creating a list.
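As a rough check of the storage claim, sys.getsizeof reports the size of a container itself (not of the objects it refers to). This is only a sketch, assuming CPython; the sample data and variable names are made up for illustration, and exact byte counts vary between versions and platforms.

```python
import sys

# Hypothetical sample data: the same 1000 integers in both containers.
items = range(1000)
as_tuple = tuple(items)
as_list = list(items)

# getsizeof measures only the container, not the integers it points to.
tuple_bytes = sys.getsizeof(as_tuple)
list_bytes = sys.getsizeof(as_list)

print("tuple: %d bytes, list: %d bytes" % (tuple_bytes, list_bytes))
```

The gap per collection is small, so it only adds up if you hold a very large number of collections.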
If you have a tuple and a list with the same elements, the tuple takes less space. Since tuples are immutable, you can't sort them in place, add to them, etc.; any such operation has to build a new object. I recommend watching this talk by Alex Gaynor for a quick intro on when to choose which data structure in Python.
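To make the immutability point concrete: sorting or truncating a tuple is still possible, but each step produces a new object instead of modifying the original. A small sketch with made-up data (the names scores and top_two are hypothetical):

```python
# Hypothetical data standing in for one of the collections.
scores = (42, 7, 19, 3)

# Tuples have no in-place sort() method.
can_sort_in_place = hasattr(scores, "sort")

# sorted() returns a new list; slicing truncates it; tuple() freezes it again.
top_two = tuple(sorted(scores, reverse=True)[:2])

print(can_sort_in_place)
print(top_two)
print(scores)  # the original tuple is unchanged
```

If the collections are sorted and truncated often, these extra copies may cost more than the tuple saves in memory, so it is worth measuring with your own data.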
UPDATE: Thinking about it some more, you may want to look into optimizing the space usage of your objects, e.g., via __slots__ or using namedtuple instances as proxies instead of the actual objects. This would likely lead to much bigger savings, since you have N of them and (presumably) only a few collections in which they appear. namedtuple in particular is super awesome; check out Raymond Hettinger's talk.
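Here is a minimal sketch of both ideas, assuming CPython. The class names PointDict, PointSlots and PointTuple and their fields are hypothetical stand-ins for your base objects, not anything from the question.

```python
import sys
from collections import namedtuple


class PointDict(object):
    """Ordinary class: each instance carries a per-instance __dict__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots(object):
    """Same fields, but __slots__ removes the per-instance __dict__."""
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


# A namedtuple is a tuple subclass with named fields.
PointTuple = namedtuple("PointTuple", ["x", "y"])

plain = PointDict(1, 2)
slotted = PointSlots(1, 2)
record = PointTuple(1, 2)

# For the ordinary instance, count the attribute dict as well.
plain_bytes = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_bytes = sys.getsizeof(slotted)

print("plain: %d bytes, slotted: %d bytes" % (plain_bytes, slotted_bytes))
print(record.x, record[1])  # fields are reachable by name or by index
```

Note that namedtuple instances are immutable like any tuple, so this option only fits if the objects do not need to change after creation.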