Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What are the memory requirements for large python list?

I was doing a foolish thing like:

from itertools import *
rows = combinations(range(0, 1140), 17)
all_rows = []
for row in rows:
    all_rows.append(row)

No surprise; I run out of memory address space (32 bit python 3.1) My question is: how do I calculate how much memory address space I will need for a large list? In this case the list is on the order of 2.3X10^37. Is there a function in Python that returns the information I am looking for, or actually the size of a smaller but similar list? What are those tools?

like image 814
Vincent Avatar asked Dec 31 '09 16:12

Vincent


People also ask

How much memory do Python lists use?

When you create a list object, the list object by itself takes 64 bytes of memory, and each item adds 8 bytes of memory to the size of the list because of references to other objects.

How much memory should Python use?

Those numbers can easily fit in a 64-bit integer, so one would hope Python would store those million integers in no more than ~8MB: a million 8-byte objects. In fact, Python uses more like 35MB of RAM to store these numbers. Why? Because Python integers are objects, and objects have a lot of memory overhead.

How much memory can Python allocate?

The issue is that 32-bit python only has access to ~4GB of RAM. This can shrink even further if your operating system is 32-bit, because of the operating system overhead. For example, in Python 2 zip function takes in multiple iterables and returns a single iterator of tuples.

How is memory allocated to a list in Python?

Python keeps a pointer to this array and the array's length is stored in a list head structure. This makes indexing of a list independent of the size of the list or the value of the index. When items are appended or inserted the array of references is resized.


2 Answers

There's a handy function sys.getsizeof() (since Python 2.6) that helps with this:

>>> import sys
>>> sys.getsizeof(1)  # integer
12
>>> sys.getsizeof([]) # empty list
36
>>> sys.getsizeof(()) # empty tuple
28
>>> sys.getsizeof((1,))  # tuple with one element
32

From that you can see that each integer takes up 12 bytes, and the memory for each reference in a list or tuple is 4 bytes (on a 32-bit machine) plus the overhead (36 or 28 bytes respectively).

If your result has tuples of length 17 with integers, then you'd have 17*(12+4)+28 or 300 bytes per tuple. The result itself is a list, so 36 bytes plus 4 per reference. Find out how long the list will be (call it N) and you have 36+N*(4+300) as the total bytes required.

Edit: There's one other thing that could significantly affect that result. Python creates new integer objects as required for most integer values, but for small ones (empirically determined to be the range [-5, 256] on Python 2.6.4 on Windows) it pre-creates them all and re-uses them. If a large portion of your values are less than 257 this would significantly reduce the memory consumption. (On Python 257 is not 257+0 ;-)).

like image 140
Peter Hansen Avatar answered Oct 19 '22 20:10

Peter Hansen


Well first rather than writing:

all_rows = []
for row in rows:
    all_rows.append(row)

You can simply write:

all_rows = list(rows)

which will be quite a bit more efficient.

Then, there are two things to consider for memory consumption of a list:

  • memory consumption of the objects comprising the list; this obviously depends on these objects, their type, and whether there is a lot of sharing or not
  • memory consumption of the list itself; each object in the list is referenced by a pointer, which will take 4 bytes in 32-bit mode and 8 bytes in 64-bit mode; so, roughly, the size of the list itself is (4 or 8 bytes) times the number of objects in the list (that's ignoring the fixed list header overhead and the moderate amount of over-allocation that Python lists do)

By the way, in recent Python versions you can use sys.getsizeof() to get the size of an object:

>>> import sys
>>> sys.getsizeof([None] * 100)
872
like image 23
Antoine P. Avatar answered Oct 19 '22 21:10

Antoine P.