Why is for _ in range(n) slower than for _ in [""]*n?

Tags: python, python-internals

While testing alternatives to for _ in range(n) (to execute some action n times, even if the action does not depend on the value of n), I noticed that another formulation of this pattern is faster: for _ in [""] * n.

For example:

timeit('for _ in range(10^1000): pass', number=1000000)

returns 16.4 seconds;

whereas,

timeit('for _ in [""]*(10^1000): pass', number=1000000)

takes 10.7 seconds.

Why is [""] * 10^1000 so much faster than range(10^1000) in Python 3?

All testing done using Python 3.3

asked May 22 '15 14:05

Dunedubby

1 Answer

When iterating over range(n), an integer object is produced for every value from 0 up to (but not including) n; this takes a (small) amount of time, even though small integers are cached.

The loop over [None] * n, on the other hand, iterates over n references to a single object, and creating that list is a little faster.

However, the range() object uses far less memory and is more readable to boot, which is why people prefer it. Most code doesn't need to squeeze out every last drop of performance.
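To make the memory trade-off concrete, here is a minimal sketch using sys.getsizeof(). The variable names are illustrative only, and the exact byte counts depend on the interpreter build, so none are shown:

```python
import sys

n = 10 ** 6

lazy = range(n)      # stores only start, stop and step
eager = [None] * n   # stores n pointers, all to the same None object

print(sys.getsizeof(lazy))   # small, and independent of n
print(sys.getsizeof(eager))  # grows linearly with n

# Every slot of the list refers to one and the same object.
print(all(item is None for item in eager))
```

Note that sys.getsizeof() reports only the container itself, which is enough here because the list holds nothing but repeated references to None.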

If you do need that speed, you can use an iterable that takes next to no memory: itertools.repeat() with a second argument, which limits the number of repetitions:

from itertools import repeat

for _ in repeat(None, n):
    pass  # the action to run n times goes here
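As a rough illustration of how this pattern might be wrapped for reuse, the sketch below defines a helper named do_n_times. That name is hypothetical and not part of itertools; only repeat() itself comes from the standard library:

```python
from itertools import repeat


def do_n_times(action, n):
    """Call action() exactly n times (hypothetical helper, for illustration)."""
    # repeat(None, n) yields None n times without building a list.
    for _ in repeat(None, n):
        action()


calls = []
do_n_times(lambda: calls.append("tick"), 3)
print(len(calls))  # 3
```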

As for your timing tests, there are some problems with those.

First of all, you made an error in your ['']*n timing loop. Inside a single-quoted string, '' does not embed two quote characters; Python sees two adjacent string literals and concatenates them, so the timed code iterated over an empty list:

>>> '['']*n'
'[]*n'
>>> []*100
[]

That's going to be unbeatable in an iteration, as you iterated 0 times.
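One way to avoid the accidental concatenation is to use a different quote character for the outer string. A small sketch, with bad and good as illustrative variable names:

```python
# Two adjacent literals, 'for _ in [' and ']*n: pass', get concatenated.
bad = 'for _ in ['']*n: pass'

# Double quotes on the outside keep the inner '' intact.
good = "for _ in ['']*n: pass"

print(bad)   # for _ in []*n: pass
print(good)  # for _ in ['']*n: pass
```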

You also didn't use large numbers; ^ is the bitwise XOR operator, not the power operator (that is **):

>>> 10^1000
994

which means your test missed out on how long it'll take to create a large list of empty values.
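For comparison, a short sketch of the two operators side by side. The final line only counts digits; actually looping 10 ** 1000 times would not finish in any practical amount of time:

```python
print(10 ^ 1000)   # 994: bitwise XOR of 0b1010 and 0b1111101000
print(10 ** 3)     # 1000: exponentiation

# 10 ** 1000 is a 1 followed by 1000 zeros.
print(len(str(10 ** 1000)))  # 1001
```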

Using genuinely large numbers (10 ** 6 iterations) and None gives you:

>>> from timeit import timeit
>>> 10 ** 6
1000000
>>> timeit("for _ in range(10 ** 6): pass", number=100)
3.0651066239806823
>>> timeit("for _ in [None] * (10 ** 6): pass", number=100)
1.9346517859958112
>>> timeit("for _ in repeat(None, 10 ** 6): pass", 'from itertools import repeat', number=100)
1.4315521717071533
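If you want to repeat this comparison on your own machine, something along these lines should work. It is only a sketch: the names SETUP, statements and results are made up for the example, the loop size is reduced to 10 ** 5 so it runs quickly, and the absolute timings will differ between machines and Python versions:

```python
from timeit import timeit

# Executed once per timing run; defines the names the statements use.
SETUP = "from itertools import repeat; N = 10 ** 5"

statements = {
    "range": "for _ in range(N): pass",
    "list": "for _ in [None] * N: pass",
    "repeat": "for _ in repeat(None, N): pass",
}

# Time each statement with the same setup and repetition count.
results = {
    name: timeit(stmt, setup=SETUP, number=20)
    for name, stmt in statements.items()
}

# Print the fastest variant first.
for name, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print("{:<7} {:.4f} s".format(name, seconds))
```

Double quotes around each statement keep the quoting problem described above from creeping back in.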

answered Oct 26 '22 23:10

Martijn Pieters
