Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is `namedtuple` really as efficient in memory usage as tuples? My test says NO

It is stated in the Python documentation that one of the advantages of namedtuple is that it is as memory-efficient as tuples.

To validate this, I used iPython with ipython_memory_usage. The test is shown in the images below:

The test shows that:

  • 10000000 instances of namedtuple used about 850 MiB of RAM
  • 10000000 tuple instances used around 73 MiB of RAM
  • 10000000 dict instances used around 570 MiB of RAM

So namedtuple used much more memory than tuple! Even more than dict!!

What do you think? Where did I go wrong?

like image 597
Ammar Alyousfi Avatar asked Dec 06 '16 19:12

Ammar Alyousfi


People also ask

What's the advantage of using NamedTuple instead of tuple?

This is because tuples are immutable. However, using a tuple may reduce the readability of your code as you cannot describe what each item in the tuple stands for. This is where NamedTuples can come in handy. A NamedTuple provides the immutability of a tuple, while also making your code easy to understand and use.

What is difference between NamedTuple and tuple?

Tuples are immutable, whether named or not. namedtuple only makes the access more convenient, by using names instead of indices. You can only use valid identifiers for namedtuple , it doesn't perform any hashing — it generates a new type instead.

Should I use NamedTuple?

In general, you can use namedtuple instances wherever you need a tuple-like object. Named tuples have the advantage that they provide a way to access their values using field names and the dot notation. This will make your code more Pythonic.

Which consumes more memory list or tuple?

List has a large memory. Tuple is stored in a single block of memory. Creating a tuple is faster than creating a list. Creating a list is slower because two memory blocks need to be accessed.


1 Answers

A simpler metric is to check the size of equivalent tuple and namedtuple objects. Given two roughly analogous objects:

from collections import namedtuple
import sys

point = namedtuple('point', 'x y z')
point1 = point(1, 2, 3)

point2 = (1, 2, 3)

Get the size of them in memory:

>>> sys.getsizeof(point1)
72

>>> sys.getsizeof(point2)
72

They look the same to me...


Taking this a step further to replicate your results, notice that if you create a list of identical tuples the way you're doing it, each tuple is the exact same object:

>>> test_list = [(1,2,3) for _ in range(10000000)]
>>> test_list[0] is test_list[-1]
True

So in your list of tuples, each index contains a reference the same object. There are not 10000000 tuples, there are 10000000 references to one tuple.

On the other hand, your list of namedtuple objects actually does create 10000000 unique objects.

A better apples-to-apples comparison would be to view the memory usage for

>>> test_list = [(i, i+1, i+2) for i in range(10000000)]

and:

>>> test_list_n = [point(x=i, y=i+1, z=i+2) for i in range(10000000)]

They have the same size:

>>> sys.getsizeof(test_list)
81528056

>>> sys.getsizeof(test_list_n)
81528056
like image 129
Billy Avatar answered Oct 13 '22 13:10

Billy