`in range` construction in python 2 --- working too slow

Q: What is faster than range in Python?

In this case, the for loop is faster, but also more elegant compared to while. Please, have in mind that you can't apply list comprehensions in all cases when you need loops. Some more complex situations require the ordinary for or even while loops.

Q: Is range a lazy Python?

So what is range? The range object is “lazy” in a sense because it doesn't generate every number that it “contains” when we create it. Instead it gives those numbers to us as we need them when looping over it. If you're looking for a description for range objects, you could call them “lazy sequences”.

Q: How do you speed up a loop in Python?

A faster way to loop using built-in functions A faster way to loop in Python is using built-in functions. In our example, we could replace the for loop with the sum function. This function will sum the values inside the range of numbers. The code above takes 0.84 seconds.

Q: Are loops slow in Python?

Looping over Python arrays, lists, or dictionaries, can be slow. Thus, vectorized operations in Numpy are mapped to highly optimized C code, making them much faster than their standard Python counterparts.

Tags:

performance

python

range

python-2.7

overhead

I wanted to check if a given x lies in the interval [0,a-1]. As a lazy coder I wrote

x in range(a)

and (because that piece of code was in 4.5 nested loops) quickly run into performance issues. I tested it and indeed, it turns out runtime of n in range(n) lies in O(n), give or take. I actually thought my code would be optimized to x >= 0 and x < a but it seems this is not the case. Even if I fix the range(a) in advance the time doesn't become constant (though it improves a lot) - see side notes.

So, my question is:

Should I use x >= 0 and x < a and never write x in range(a) ever again? Is there an even better way of writing it?

Side notes:

I tried searching SO for range, python-2.7, performance tags put together, found nothing (same with python-2.x).
If I try following:
```
i = range(a)
...
x in i
```
so that the range is fixed and I only measure runtime of x in i, I still get runtime in O(x) (assuming a is large enough).
runtime of n in xrange(n) lies in O(n) as well.
I found this post, which asks similar question for python 3. I decided to test same stuff on python 3 and it rolled through tests like it's nothing. I got sad for python 2.

372

asked Feb 27 '18 04:02

mike239x

1 Answers

The problem with range in Python 2 is that it creates a list of values, so x in range(a) will create a list and linearly scan that list. xrange should be a generator, but it is not much faster; probably still just linearly scans the values, just without creating the entire list first.

In [2]: %timeit 5*10**5 in range(10**6 + 1)  # Python 2
10 loops, best of 3: 18.1 ms per loop

In [3]: %timeit 5*10**5 in xrange(10**6 + 1) # Python 2
100 loops, best of 3: 6.21 ms per loop

In Python 3, range is much smarter, not only not creating the entire list, but also providing a fast implementation of the contains check.

In [1]: %timeit 5*10**5 in range(10**6 + 1)  # Python 3
1000000 loops, best of 3: 324 ns per loop

Even faster and IMHO more readable: Using comparison chaining:

In [2]: %timeit 0 <= 5*10**5 < 10**6 + 1     # Python 2 or 3
10000000 loops, best of 3: 46.6 ns per loop

Should I use x >= 0 and x < a and never write x in range(a) ever again? Is there an even better way of writing it?

"No", "it depends", and "yes". You should not use x >= 0 and x < a because 0 <= x < a is shorter and easier to parse (for puny humans), and is interpreted as (0 <= x) and (x < a). And you should not use in range in Python 2, but in Python 3, you can use it if you like.

Still, I'd prefer comparison chaining, since a <= x < b is much more explicit about bounds than x in range(a, b) (what if x == b?), which could prevent many off-by-one errors or +1 padding the range.

Also, note that 0 <= x < a is not strictly the same as x in range(0, a), as a range will only ever contain integer values, i.e. 1.5 in range(0, 5) is False, whereas 0 <= 1.5 < 5 is True, which may of may not what you want. Also, using range you can use steps other than 1, e.g. 5 in range(4, 10, 2) is False, but the same can also be implemented using pure math, e.g. as (4 <= x < 10) and (x - 4 % 2 == 0).

115

answered Sep 23 '22 22:09

tobias_k

Related questions
                            
                                How to hash int/long using hashlib in Python?
                            
                                SED or AWK script to replace multiple text
                            
                                Get all except first and last n elements of a numpy array
                            
                                Difference between Scrapy selectors "a::text" and "a ::text"
                            
                                Pygame cause for low FPS. How can the performance be improved?
                            
                                How to generate json using Jinja2 Template Engine
                            
                                Flask-WTF set time limit on CSRF token
                            
                                Calculating event based on the continuous timestep
                            
                                Use multiple output stream in python?
                            
                                python exchangelib how to get email within specific date range?
                            
                                Python3 Typerror: replace() argument 1 must be str, not int
                            
                                Difference between Python2 and Python3 while using map with find and index
                            
                                How to mock elastic search python?
                            
                                Count each group sequentially pandas
                            
                                Python mocking, return different value every call
                            
                                Django REST Framework - How to return 404 error instead of 403
                            
                                IOPub Error on Google Colaboratory in Jupyter Notebook
                            
                                not equals operator(!=) not working in python string comparison
                            
                                How do you have a bot add a reaction using a custom emoji?
                            
                                Python: Check if a directory exists using os module

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With