I was doing a foolish thing like:
from itertools import combinations

rows = combinations(range(0, 1140), 17)
all_rows = []
for row in rows:
    all_rows.append(row)
No surprise; I ran out of memory address space (32-bit Python 3.1).
My question is: how do I calculate how much memory address space I will need for a large list? In this case the list is on the order of 2.3×10^37 items.
Is there a function in Python that returns the information I am looking for, or at least the size of a smaller but similar list? What tools are available?
A list is stored as a contiguous array of references (pointers) to the item objects; Python keeps a pointer to this array, and the array's length, in a list head structure. This makes indexing a list independent of the size of the list or the value of the index, and when items are appended or inserted, the array of references is resized. Each reference costs 4 bytes on a 32-bit build (8 bytes on a 64-bit build), on top of the list head itself. The items are full Python objects with their own overhead: a million small integers would fit in ~8 MB as raw 8-byte values, yet Python needs more like 35 MB of RAM to store them, because each integer is an object. Finally, bear in mind that a 32-bit Python process can address at most ~4 GB of RAM, and in practice less, because the operating system reserves part of that address space.
There's a handy function sys.getsizeof() (since Python 2.6) that helps with this:
>>> import sys
>>> sys.getsizeof(1) # integer
12
>>> sys.getsizeof([]) # empty list
36
>>> sys.getsizeof(()) # empty tuple
28
>>> sys.getsizeof((1,)) # tuple with one element
32
From that you can see that each integer takes up 12 bytes and each reference in a list or tuple costs 4 bytes (on a 32-bit machine), plus the container's own overhead (36 bytes for a list, 28 for a tuple).
If your result has tuples of length 17 containing integers, then each tuple takes 17*(12+4)+28, or 300 bytes. The result itself is a list, so that's 36 bytes plus 4 per reference. Find out how long the list will be (call it N) and you have 36+N*(4+300) as the total bytes required.
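As a rough sketch of that arithmetic, using math.comb (available since Python 3.8) to count the rows and assuming the ~300-bytes-per-tuple figure estimated above:

```python
from math import comb

# Number of 17-element combinations of 1140 items.
n = comb(1140, 17)

# Estimated total on a 32-bit build: a 36-byte list header plus,
# per row, a 4-byte reference and ~300 bytes for the tuple itself.
total_bytes = 36 + n * (4 + 300)

print(f"{n:.3e} rows, roughly {total_bytes:.3e} bytes")
```

That row count (~2.3×10^37) is why no amount of per-item thriftiness helps here: the list simply cannot be materialized.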
Edit: There's one other thing that could significantly affect that result. Python creates a new integer object for most integer values, but for small ones (empirically determined to be the range [-5, 256] on Python 2.6.4 on Windows) it pre-creates them all and re-uses them. If a large portion of your values are less than 257, this would significantly reduce the memory consumption. (On Python 2.6.4, 257 is not 257+0 ;-)).
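This CPython-specific caching can be observed directly. The sketch below uses int("...") rather than integer literals, since two identical literals in the same code object may be folded into one constant and would mask the effect:

```python
# CPython pre-creates and re-uses small integers (roughly -5..256),
# so separate computations of the same small value yield one object.
a, b = int("256"), int("256")
print(a is b)   # True: both names point at the cached 256

# Values outside the cache range get a fresh object each time.
c, d = int("257"), int("257")
print(c is d)   # False: two distinct 257 objects
```

References to cached small integers therefore add no per-item object cost; only the 4-byte reference in the container is new.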
Well, first, rather than writing:
all_rows = []
for row in rows:
    all_rows.append(row)
You can simply write:
all_rows = list(rows)
which will be quite a bit more efficient.
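For a small input, the two forms produce identical results; a minimal sketch with an input size where materializing the iterator is safe:

```python
from itertools import combinations

# A small case: C(5, 3) = 10 rows, trivially fits in memory.
rows = combinations(range(5), 3)
all_rows = list(rows)       # one call instead of an explicit append loop
print(len(all_rows))        # 10
```

list() lets the list preallocate and grow its reference array internally instead of going through a Python-level append call per item, which is where the efficiency gain comes from.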
Then, there are two things to consider for memory consumption of a list: the array of references the list itself holds, and the item objects those references point to.
By the way, in recent Python versions you can use sys.getsizeof() to get the size of an object:
>>> import sys
>>> sys.getsizeof([None] * 100)
872
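One caveat worth knowing: sys.getsizeof() is shallow, reporting only the list object itself (header plus reference array), not the objects the references point to. A sketch of adding the item sizes by hand (naively, ignoring that cached small integers are shared):

```python
import sys

lst = list(range(100))
shallow = sys.getsizeof(lst)                          # header + references only
deep = shallow + sum(sys.getsizeof(x) for x in lst)   # plus the item objects
print(shallow, deep)
```

The gap between the two numbers is exactly the per-object integer overhead discussed above.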