Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Size of list in memory

I just experimented with the size of python data structures in memory. I wrote the following snippet:

import sys lst1=[] lst1.append(1) lst2=[1] print(sys.getsizeof(lst1), sys.getsizeof(lst2)) 

I tested the code on the following configurations:

  • Windows 7 64bit, Python3.1: the output is: 52 40 so lst1 has 52 bytes and lst2 has 40 bytes.
  • Ubuntu 11.4 32bit with Python3.2: output is 48 32
  • Ubuntu 11.4 32bit Python2.7: 48 36

Can anyone explain to me why the two sizes differ although both are lists containing a 1?

In the python documentation for the getsizeof function I found the following: ...adds an additional garbage collector overhead if the object is managed by the garbage collector. Could this be the case in my little example?

like image 497
halex Avatar asked Aug 30 '11 17:08

halex


People also ask

How do I check memory list size?

In order to determine the size of the list, we have passed the list object to the getsizeof() function which on execution return the sizes of list1 and list2 in bytes along with garbage collector overhead. List1 and list2 are occupying 112 bytes and 104 bytes in the memory.

How many bytes does a list take?

An empty list takes 56 bytes, but each additional int adds just 8 bytes, where the size of an int is 28 bytes. A list that contains a long string takes just 64 bytes.

Are lists stored in memory?

List has a large memory. Tuple is stored in a single block of memory. Creating a tuple is faster than creating a list. Creating a list is slower because two memory blocks need to be accessed.

How is memory allocated for Python list?

Python keeps a pointer to this array and the array's length is stored in a list head structure. This makes indexing of a list independent of the size of the list or the value of the index. When items are appended or inserted the array of references is resized.


1 Answers

Here's a fuller interactive session that will help me explain what's going on (Python 2.6 on Windows XP 32-bit, but it doesn't matter really):

>>> import sys >>> sys.getsizeof([]) 36 >>> sys.getsizeof([1]) 40 >>> lst = [] >>> lst.append(1) >>> sys.getsizeof(lst) 52 >>>  

Note that the empty list is a bit smaller than the one with [1] in it. When an element is appended, however, it grows much larger.

The reason for this is the implementation details in Objects/listobject.c, in the source of CPython.

Empty list

When an empty list [] is created, no space for elements is allocated - this can be seen in PyList_New. 36 bytes is the amount of space required for the list data structure itself on a 32-bit machine.

List with one element

When a list with a single element [1] is created, space for one element is allocated in addition to the memory required by the list data structure itself. Again, this can be found in PyList_New. Given size as argument, it computes:

nbytes = size * sizeof(PyObject *); 

And then has:

if (size <= 0)     op->ob_item = NULL; else {     op->ob_item = (PyObject **) PyMem_MALLOC(nbytes);     if (op->ob_item == NULL) {         Py_DECREF(op);         return PyErr_NoMemory();     }     memset(op->ob_item, 0, nbytes); } Py_SIZE(op) = size; op->allocated = size; 

So we see that with size = 1, space for one pointer is allocated. 4 bytes (on my 32-bit box).

Appending to an empty list

When calling append on an empty list, here's what happens:

  • PyList_Append calls app1
  • app1 asks for the list's size (and gets 0 as an answer)
  • app1 then calls list_resize with size+1 (1 in our case)
  • list_resize has an interesting allocation strategy, summarized in this comment from its source.

Here it is:

/* This over-allocates proportional to the list size, making room * for additional growth.  The over-allocation is mild, but is * enough to give linear-time amortized behavior over a long * sequence of appends() in the presence of a poorly-performing * system realloc(). * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ... */ new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);  /* check for integer overflow */ if (new_allocated > PY_SIZE_MAX - newsize) {     PyErr_NoMemory();     return -1; } else {     new_allocated += newsize; } 

Let's do some math

Let's see how the numbers I quoted in the session in the beginning of my article are reached.

So 36 bytes is the size required by the list data structure itself on 32-bit. With a single element, space is allocated for one pointer, so that's 4 extra bytes - total 40 bytes. OK so far.

When app1 is called on an empty list, it calls list_resize with size=1. According to the over-allocation algorithm of list_resize, the next largest available size after 1 is 4, so place for 4 pointers will be allocated. 4 * 4 = 16 bytes, and 36 + 16 = 52.

Indeed, everything makes sense :-)

like image 169
Eli Bendersky Avatar answered Sep 20 '22 10:09

Eli Bendersky