Pre-allocating a list of None

Tags:

Suppose you want to write a function which yields a list of objects, and you know in advance the length n of such list.

In python the list supports indexed access in O(1), so it is arguably a good idea to pre-allocate the list and access it with indexes instead of allocating an empty list and using the append() method. This is because we avoid the burden of expanding the whole list if the space is not enough.

If I'm using python, probably performances are not that relevant in any case, but what is the better way of pre-allocating a list?

I know these possible candidates:

[None] * n → allocating two lists
[None for x in range(n)] — or xrange in python2 → building another object

Is one significantly better than the other?

What if we are in the case n = len(input)? Since input exists already, would [None for x in input] have better performances w.r.t. [None] * len(input)?

869

asked Mar 06 '14 13:03

Dacav

2 Answers

In between those two options the first one is clearly better as no Python for loop is involved.

>>> %timeit [None] * 100
1000000 loops, best of 3: 469 ns per loop
>>> %timeit [None for x in range(100)] 
100000 loops, best of 3: 4.8 us per loop

Update:

And list.append has an O(1) complexity too, it might be a better choice than pre-creating list if you assign the list.append method to a variable.

>>> n = 10**3
>>> %%timeit
lis = [None]*n           
for _ in range(n):
    lis[_] = _
... 
10000 loops, best of 3: 73.2 us per loop
>>> %%timeit
lis = []                 
for _ in range(n):
    lis.append(_)
... 
10000 loops, best of 3: 92.2 us per loop
>>> %%timeit
lis = [];app = lis.append
for _ in range(n):
    app(_)
... 
10000 loops, best of 3: 59.4 us per loop

>>> n = 10**6
>>> %%timeit
lis = [None]*n
for _ in range(n):
    lis[_] = _
... 
10 loops, best of 3: 106 ms per loop
>>> %%timeit
lis = []      
for _ in range(n):
    lis.append(_)
... 
10 loops, best of 3: 122 ms per loop
>>> %%timeit
lis = [];app = lis.append
for _ in range(n):
    app(_)
... 
10 loops, best of 3: 91.8 ms per loop

172

answered Oct 22 '22 08:10

Ashwini Chaudhary

When you append an item to a list, Python 'over-allocates', see the source-code of the list object. This means that for example when adding 1 item to a list of 8 items, it actually makes room for 8 new items, and uses only the first one of those. The next 7 appends are then 'for free'.

In many languages (e.g. old versions of Matlab, the newer JIT might be better) you are always told that you need to pre-allocate your vectors, since appending during a loop is very expensive. In the worst case, appending of a single item to a list of length n can cost O(n) time, since you might have to create a bigger list and copy all the existing items over. If you need to do this on every iteration, the overall cost of adding n items is O(n^2), ouch. Python's pre-allocation scheme spreads the cost of growing the array over many single appends (see amortized costs), effectively making the cost of a single append O(1) and the overall cost of adding n items O(n).

Additionally, the overhead of the rest of your Python code is usually so large, that the tiny speedup that can be obtained by pre-allocating is insignificant. So in most cases, simply forget about pre-allocating, unless your profiler tells you that appending to a list is a bottleneck.

The other answers show some profiling of the list preallocation itself, but this is useless. The only thing that matters is profiling your complete code, with all your calculations inside your loop, with and without pre-allocation. If my prediction is right, the difference is so small that the computation time you win is dwarfed by the time spent thinking about, writing and maintaining the extra lines to pre-allocate your list.

answered Oct 22 '22 07:10

Bas Swinckels

Related questions
                            
                                What is the fastest and most efficient way to append rows to a DataFrame?
                            
                                Random Choice with Pytorch?
                            
                                Switching Python version installed by Homebrew
                            
                                Unexpected output of 'arch' on OSX (using Mac M1 installing elastic beans)
                            
                                django is very slow on my machine
                            
                                why is __init__ module in django project loaded twice
                            
                                What's the best practice for handling single-value tuples in Python?
                            
                                Sort a list of dicts by dict values
                            
                                Running python on a Windows machine vs Linux
                            
                                basics of python encryption w/ hashlib sha1
                            
                                Unable to install Python and GDAL (DLL load failed)
                            
                                Matplotlib : Comma separated number format for axis
                            
                                python: how to tell if file executed as import vs. main script?
                            
                                Unable to encode/decode pprint output
                            
                                Turn off the upper/right axis tick marks
                            
                                python closure with assigning outer variable inside inner function
                            
                                MongoDB insert raises duplicate key error
                            
                                How to my "exe" from PyCharm project [duplicate]
                            
                                cx_Oracle: ImportError: DLL load failed: This application has failed
                            
                                Writing a csv file into SQL Server database using python

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Pre-allocating a list of None

Tags:

performance

python

list

design-patterns

python-3.x

Dacav

People also ask

2 Answers

Ashwini Chaudhary

Bas Swinckels

Recent Activity

Donate For Us