I am trying to perform addition efficiently in Python over large loops. I am looping over a range of 100000000.
from datetime import datetime
start_time = datetime.now()
sum = 0
for i in range(100000000):
    sum+=i
end_time = datetime.now()
print('--- %s seconds ---{}'.format(end_time - start_time))
print(sum)
The output from the above code is --- %s seconds ---0:00:16.662666 4999999950000000
When I try to do it in C, it takes 0.43 seconds.
From what I have read, Python creates a new object every time you perform addition on a variable. I read some articles and learned how to do string concatenation in these situations by avoiding the '+' sign, but I can't find anything about how to do this with integers.
A faster way to loop in Python is using built-in functions. In our example, we could replace the for loop with the sum function. This function will sum the values inside the range of numbers.
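A minimal sketch of that replacement, using a smaller range than the question so it runs quickly. The names loop_total and builtin_total are illustrative only.

```python
# Sum 0..n-1 twice: once with an explicit loop, once with the built-in sum().
n = 1_000_000  # assumption: reduced from 100,000,000 so this finishes quickly

loop_total = 0
for i in range(n):
    loop_total += i  # each iteration is executed by the bytecode interpreter

builtin_total = sum(range(n))  # the iteration happens inside the built-in

print(loop_total == builtin_total)
```

Both variables hold the same value; only the place where the loop runs changes.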
A for loop can iterate over a generator directly in Python. A while loop cannot; it has to call next() on the generator itself. A for loop with range() uses 3 operations, and the range() function is implemented in C, so it is faster.
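To illustrate the difference, here is a small sketch that consumes the same generator with a for loop and with a while loop. The generator squares is a hypothetical example, not something from the article.

```python
def squares(limit):
    """Hypothetical generator yielding 0, 1, 4, ... for illustration."""
    for k in range(limit):
        yield k * k

# A for loop consumes the generator directly.
for_total = 0
for value in squares(5):
    for_total += value

# A while loop has to call next() itself and handle StopIteration.
while_total = 0
gen = squares(5)
while True:
    try:
        while_total += next(gen)
    except StopIteration:
        break

print(for_total, while_total)  # both are 0 + 1 + 4 + 9 + 16 = 30
```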
This article compares the performance of Python loops when adding two lists or arrays element-wise. The results show that list comprehensions were faster than the ordinary for loop, which was faster than the while loop. The simple loops were slightly faster than the nested loops in all three cases.
An implied loop in map() is faster than an explicit for loop; a while loop with an explicit loop counter is even slower. Avoid calling functions written in Python in your inner loop.
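A rough way to check this on your own machine is sketched below. The list contents and function names are assumptions for illustration, and the note about operator.add being implemented in C applies to CPython. The lambda variant shows the cost of calling a Python-level function per element.

```python
from operator import add
from timeit import timeit

x = list(range(1_000))
y = list(range(1_000, 2_000))

def with_for_loop():
    out = []
    for a, b in zip(x, y):
        out.append(a + b)
    return out

def with_map_builtin():
    # operator.add is implemented in C in CPython, so map() does not
    # make a Python-level function call for each element
    return list(map(add, x, y))

def with_map_lambda():
    # same map(), but every element now costs a Python function call
    return list(map(lambda a, b: a + b, x, y))

for fn in (with_for_loop, with_map_builtin, with_map_lambda):
    print(fn.__name__, timeit(fn, number=1_000))
```

The absolute numbers depend on the machine and Python version, so only the relative order is worth comparing.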
For example, the general advice is to use optimized Python built-in or third-party routines, usually written in C or Cython. Besides, it’s faster to work with local variables than with globals, so it’s a good practice to copy a global variable to a local before the loop. And so on.
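The local-versus-global point can be sketched as follows. SCALE and both function names are hypothetical, and how large the difference is depends on the Python version.

```python
from timeit import timeit

SCALE = 3  # hypothetical module-level (global) value

def use_global(n=100_000):
    total = 0
    for i in range(n):
        total += i * SCALE  # global name lookup on every iteration
    return total

def use_local(n=100_000):
    scale = SCALE  # copy the global to a local once, before the loop
    total = 0
    for i in range(n):
        total += i * scale  # local variable lookup
    return total

print("global:", timeit(use_global, number=50))
print("local: ", timeit(use_local, number=50))
```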
Performance issues often arise when using Python loops, especially with a large number of iterations. There are a number of useful tricks to improve your code and make it run faster, but that’s beyond the scope here. This article compares the performance of several approaches when summing two sequences element-wise.
Let’s first see some simple Python loops in action. We’ll start with two lists with 1,000 elements each. The integer variable n represents the length of each list. The lists x and y are obtained by randomly choosing n elements from r:
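The setup code is not included in this excerpt. Below is a minimal sketch of what it might look like, followed by the for loop, while loop and list comprehension variants of the element-wise sum. The contents of r, the use of random.choices and the fixed seed are all assumptions.

```python
import random

random.seed(0)            # assumption: fixed seed so runs are repeatable
n = 1_000                 # length of each list
r = range(-100, 100)      # assumption: the population to sample from
x = random.choices(r, k=n)
y = random.choices(r, k=n)

# Plain for loop
z_for = []
for i in range(n):
    z_for.append(x[i] + y[i])

# while loop with an explicit counter
z_while = []
i = 0
while i < n:
    z_while.append(x[i] + y[i])
    i += 1

# List comprehension
z_comp = [x[i] + y[i] for i in range(n)]

print(z_for == z_while == z_comp)
```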
Consider using the sum() function if you can process the list as a whole. It loops entirely in C code, is much faster, and also avoids the creation of new Python objects.
sum(range(100000000))
On my computer, your code takes 7.189210 seconds, while the above statement takes 2.751251 seconds, which makes it about 2.6 times faster.
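Timings taken from a single pair of datetime.now() calls can be noisy. As a hedged sketch, the standard timeit module can repeat each measurement and keep the best run. The reduced n and the helper names loop_sum and builtin_sum are assumptions for illustration.

```python
import timeit

n = 1_000_000  # assumption: reduced from 100,000,000 so the benchmark is quick

def loop_sum():
    total = 0
    for i in range(n):
        total += i
    return total

def builtin_sum():
    return sum(range(n))

# repeat() returns one total time per repetition; the minimum is the least noisy
loop_best = min(timeit.repeat(loop_sum, number=5, repeat=3))
builtin_best = min(timeit.repeat(builtin_sum, number=5, repeat=3))
print(f"loop: {loop_best:.3f}s  sum(): {builtin_best:.3f}s  "
      f"ratio: {loop_best / builtin_best:.1f}x")
```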
Edit: as suggested by mtrw, numpy.sum() can speed up processing even more.
Here is a comparison of three methods: your original way, using sum(range(100000000)) as suggested by Alex Metsai, and using the NumPy numerical library's sum and arange functions:
from datetime import datetime
import numpy as np

def orig():
    start_time = datetime.now()
    sum = 0
    for i in range(100000000):
        sum+=i
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(sum)

def pyway():
    start_time = datetime.now()
    mysum = sum(range(100000000))
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(mysum)

def npway():
    start_time = datetime.now()
    sum = np.sum(np.arange(100000000))
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(sum)
On my computer, I get:
>>> orig()
--- %s seconds ---0:00:09.504018
4999999950000000
>>> pyway()
--- %s seconds ---0:00:02.382020
4999999950000000
>>> npway()
--- %s seconds ---0:00:00.683411
4999999950000000
NumPy is the fastest, if you can use it in your application.
But, as suggested by Ethan in a comment, it's worth pointing out that calculating the answer directly is by far the fastest:
def mathway():
    start_time = datetime.now()
    mysum = 99999999*(99999999+1)/2
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(mysum)
>>> mathway()
--- %s seconds ---0:00:00.000013
4999999950000000.0
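The trailing .0 appears because / always returns a float in Python 3, and floats stop representing every integer exactly above 2**53. A small sketch of the same closed form with integer division, which stays exact for any size; the function name sum_below is hypothetical.

```python
def sum_below(n):
    """Return 0 + 1 + ... + (n - 1), i.e. sum(range(n)), using exact integer math."""
    # One of two consecutive integers is always even, so // loses nothing here.
    return (n - 1) * n // 2

print(sum_below(100_000_000))  # 4999999950000000
```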
I assume your actual problem is not so easily solved by pencil and paper :)