We all know that the common way of executing a statement a certain number of times in Python is to use a for loop. The general way of doing this is:
```python
# I am assuming the iterated list is redundant.
# Just the number of executions matters.
for _ in range(count):
    pass
```
I believe nobody will argue that the code above is the common implementation; however, there is another option: exploiting the speed of Python list creation by multiplying references.
```python
# Uncommon way.
for _ in [0] * count:
    pass
```
There is also the old while way:

```python
i = 0
while i < count:
    i += 1
```
I tested the execution times of these approaches. Here is the code.
```python
import timeit

repeat = 10
total = 10

setup = """
count = 100000
"""

test1 = """
for _ in range(count):
    pass
"""

test2 = """
for _ in [0] * count:
    pass
"""

test3 = """
i = 0
while i < count:
    i += 1
"""

print(min(timeit.Timer(test1, setup=setup).repeat(repeat, total)))
print(min(timeit.Timer(test2, setup=setup).repeat(repeat, total)))
print(min(timeit.Timer(test3, setup=setup).repeat(repeat, total)))

# Results
# 0.02238852552017738
# 0.011760978361696095
# 0.06971727824807639
```
I would not raise the subject if the difference were small, but as the results show, the second method is roughly twice as fast. Why doesn't Python encourage such usage if the second method is so much more efficient? Is there a better way?
The tests were run on Windows 10 with Python 3.6.
Following @Tim Peters' suggestion,
```python
# ... (same as above, with "import itertools" added to the setup string)

test4 = """
for _ in itertools.repeat(None, count):
    pass
"""

print(min(timeit.Timer(test1, setup=setup).repeat(repeat, total)))
print(min(timeit.Timer(test2, setup=setup).repeat(repeat, total)))
print(min(timeit.Timer(test3, setup=setup).repeat(repeat, total)))
print(min(timeit.Timer(test4, setup=setup).repeat(repeat, total)))

# Gives
# 0.02306803115612352
# 0.013021619340942758
# 0.06400113461638746
# 0.008105080015739174
```
This offers a much better way, and it pretty much answers my question. But why is this faster than range, since both produce their values lazily? Is it because the value never changes?
The first method (in Python 3) creates a range object, which can iterate through the range of values. (It's like a generator object but you can iterate through it several times.) It doesn't take up much memory because it doesn't contain the entire range of values, just a current and a maximum value, where it keeps increasing by the step size (default 1) until it hits or passes the maximum.
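The reusability mentioned above is easy to demonstrate; here is a small sketch comparing a range to a throwaway generator expression:

```python
# A range can be iterated any number of times...
r = range(3)
print(list(r))  # [0, 1, 2]
print(list(r))  # [0, 1, 2] again; the range is reusable

# ...whereas a generator is exhausted after a single pass.
g = (x for x in range(3))
print(list(g))  # [0, 1, 2]
print(list(g))  # []
```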
Compare the size of range(0, 1000) to the size of list(range(0, 1000)): the former is very memory efficient; it only takes 48 bytes regardless of the size, whereas the list's size increases linearly with the number of elements.
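This is straightforward to verify with sys.getsizeof (the 48-byte figure is a 64-bit CPython detail; exact numbers may differ on other builds):

```python
import sys

# sys.getsizeof reports the object's own size, not the size of its contents.
print(sys.getsizeof(range(0, 1000)))        # constant (48 on 64-bit CPython)
print(sys.getsizeof(range(0, 10**9)))       # same size despite the huge span
print(sys.getsizeof(list(range(0, 1000))))  # grows with the number of elements
```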
The second method, although faster, takes up the memory I was talking about in the previous one. (Also, it seems that although 0 takes up 24 bytes and None takes 16, lists of 10000 of each have the same size. Interesting. Probably because the list only stores pointers.)
Interestingly enough, [0] * 10000 is smaller than list(range(10000)) by about 10000 bytes, which kind of makes sense: multiplying a list allocates exactly the size needed up front, whereas building a list from an iterable may over-allocate a little.
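Both observations above can be checked with sys.getsizeof; the exact byte counts are CPython implementation details, so the sketch below only compares sizes rather than hard-coding them:

```python
import sys

n = 10000

# A list stores pointers to its elements, so the element type does not
# affect the list object's own size.
print(sys.getsizeof([0] * n) == sys.getsizeof([None] * n))  # True

# [0] * n allocates exactly, while list(range(n)) may over-allocate,
# so the latter is at least as large (a CPython allocation detail).
print(sys.getsizeof(list(range(n))) >= sys.getsizeof([0] * n))  # True
```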
The third one is also nice because it doesn't require another stack value (whereas calling range requires another spot on the call stack), though since it's about six times slower, it's not worth it.
The last one might be the fastest just because itertools is cool that way :P I think it uses some C-level optimizations, if I remember correctly.
Using

```python
for _ in itertools.repeat(None, count):
    ...  # do something
```

is the non-obvious way of getting the best of all worlds: tiny constant space requirement, and no new objects created per iteration. Under the covers, the C code for repeat uses a native C integer type (not a Python integer object!) to keep track of the count remaining.
For that reason, the count needs to fit in the platform C ssize_t type, which is generally at most 2**31 - 1 on a 32-bit box, and here on a 64-bit box:
```python
>>> itertools.repeat(None, 2**63)
Traceback (most recent call last):
...
OverflowError: Python int too large to convert to C ssize_t
>>> itertools.repeat(None, 2**63 - 1)
repeat(None, 9223372036854775807)
```
Which is plenty big for my loops ;-)
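One way to find that limit without hard-coding 2**63 - 1: on CPython, sys.maxsize is the largest value a C ssize_t can hold, so it is also the largest count repeat accepts (a sketch; the exact bound is platform-specific):

```python
import itertools
import sys

# sys.maxsize is the platform's ssize_t maximum (2**63 - 1 on a 64-bit build).
itertools.repeat(None, sys.maxsize)  # accepted

try:
    itertools.repeat(None, sys.maxsize + 1)
except OverflowError as exc:
    print(exc)  # Python int too large to convert to C ssize_t
```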