It is stated in the Python documentation that one of the advantages of namedtuple
is that it is as memory-efficient as tuples.
To validate this, I used iPython with ipython_memory_usage. The test is shown in the images below:
The test shows that:
10000000
instances of namedtuple
used about 850 MiB
of RAM10000000
tuple
instances used around 73 MiB
of RAM10000000
dict
instances used around 570 MiB
of RAMSo namedtuple
used much more memory than tuple
! Even more than dict
!!
What do you think? Where did I go wrong?
This is because tuples are immutable. However, using a tuple may reduce the readability of your code as you cannot describe what each item in the tuple stands for. This is where NamedTuples can come in handy. A NamedTuple provides the immutability of a tuple, while also making your code easy to understand and use.
Tuples are immutable, whether named or not. namedtuple only makes the access more convenient, by using names instead of indices. You can only use valid identifiers for namedtuple , it doesn't perform any hashing — it generates a new type instead.
In general, you can use namedtuple instances wherever you need a tuple-like object. Named tuples have the advantage that they provide a way to access their values using field names and the dot notation. This will make your code more Pythonic.
List has a large memory. Tuple is stored in a single block of memory. Creating a tuple is faster than creating a list. Creating a list is slower because two memory blocks need to be accessed.
A simpler metric is to check the size of equivalent tuple
and namedtuple
objects. Given two roughly analogous objects:
from collections import namedtuple
import sys
point = namedtuple('point', 'x y z')
point1 = point(1, 2, 3)
point2 = (1, 2, 3)
Get the size of them in memory:
>>> sys.getsizeof(point1)
72
>>> sys.getsizeof(point2)
72
They look the same to me...
Taking this a step further to replicate your results, notice that if you create a list of identical tuples the way you're doing it, each tuple
is the exact same object:
>>> test_list = [(1,2,3) for _ in range(10000000)]
>>> test_list[0] is test_list[-1]
True
So in your list of tuples, each index contains a reference the same object. There are not 10000000 tuples, there are 10000000 references to one tuple.
On the other hand, your list of namedtuple
objects actually does create 10000000 unique objects.
A better apples-to-apples comparison would be to view the memory usage for
>>> test_list = [(i, i+1, i+2) for i in range(10000000)]
and:
>>> test_list_n = [point(x=i, y=i+1, z=i+2) for i in range(10000000)]
They have the same size:
>>> sys.getsizeof(test_list)
81528056
>>> sys.getsizeof(test_list_n)
81528056
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With