Why does this line of Python
yy = [sum(y[i:i+5])/5. for i in range(len(y)-4)]
run some 20 times faster than the following (equivalent) code?
for i in xrange(0, len(y)-4):
    yy = np.append(yy, sum(y[i:i+5])/5.)
Where y is a large array of reals. What actually is going on under the hood here? Many thanks.
The two codes are not equivalent. The correct equivalent version is:
yy = []
for i in range(0, len(y)-4):
    yy.append(sum(y[i:i+5])/5.)
Which takes about the same time:
In [10]: y = [1.0] * 100000
In [11]: %timeit [sum(y[i:i+5])/5. for i in range(len(y)-4)]
10 loops, best of 3: 49.6 ms per loop
In [12]: %%timeit yy = []
    ...: for i in range(0,len(y)-4):
    ...:     yy.append(sum(y[i:i+5])/5.)
...:
10 loops, best of 3: 55.1 ms per loop
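For comparison, you can time the numpy.append version in the same session (output omitted here since it varies by machine; the question reports it around 20 times slower):

In [13]: import numpy as np

In [14]: %%timeit yy = np.array([])
    ...: for i in range(0, len(y)-4):
    ...:     yy = np.append(yy, sum(y[i:i+5])/5.)
    ...: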
The problem is the call to numpy.append, which is much slower than list.append. This is probably due to the fact that numpy.append creates a copy of the array and returns it for every insertion.
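You can see the copy-and-return behaviour directly (a minimal sketch; numpy.append never modifies its input in place):

import numpy as np

a = np.array([1.0, 2.0])
b = np.append(a, 3.0)   # allocates a brand-new array and copies a into it
print(a)                # [1. 2.]  -- the original is untouched
print(b)                # [1. 2. 3.]
print(a is b)           # False: every call pays for a full copy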
The first insertion costs 2 (allocate space for 1 element and copy it there). The second costs 3 (allocate space for 2 elements, copy the lone element and the new one). The third costs 4 (allocate for 3, copy 2 elements and the new one), etc.
Summed over all n appends, that is 2 + 3 + ... + (n+1) ≈ n²/2 element copies. This means that the algorithm suddenly became O(n^2), while it is O(n) using Python lists, since they do not copy the whole list for every append. They are smart enough to over-allocate memory to accommodate more elements, so an append is amortized O(1).
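If you are curious, you can watch that over-allocation happen (this assumes CPython, where sys.getsizeof reflects the list's currently allocated capacity, not just its length):

import sys

lst = []
last = sys.getsizeof(lst)
for i in range(32):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last:  # the capacity just grew
        print("len=%2d -> %d bytes" % (len(lst), size))
        last = size

The size only jumps occasionally, so most appends are plain O(1) writes into already-reserved space.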
Also, as a general rule, numpy does not shine for single-element accesses. It's actually slower than pure Python in that case, because it has to convert between machine data types and Python objects all the time. Try to vectorize the operations and you'll see big speed-ups.
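For this particular moving average, one vectorized way is a sketch using np.convolve, where np.ones(5)/5. is the 5-point averaging window:

import numpy as np

y = np.random.rand(100000)

# 5-point moving average in a single C-level pass;
# mode='valid' keeps only the full windows, giving len(y) - 4 results
yy = np.convolve(y, np.ones(5) / 5., mode='valid')

This computes the same values as the list comprehension, without any Python-level loop.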