I recently converted a MATLAB script to Python with Numpy, and found that it ran significantly slower. I expected similar performance, so I'm wondering if I'm doing something wrong.
As stripped-down example, I manually sum a geometric series:
MATLAB version:
function s = array_sum(a, array_size, iterations)
s = zeros(array_size);
for m = 1:iterations
s = a + 0.5*s;
end
end
% benchmark code
array_size = 500
iterations = 500
a = randn(array_size)
f = @() array_sum(a, array_size, iterations);
fprintf('run time: %.2f ms\n', timeit(f)*1e3);
Python/Numpy version:
import numpy as np
import timeit
def array_sum(a, array_size, iterations):
s = np.zeros((array_size, array_size))
for m in range(iterations):
s = a + 0.5*s
return s
array_size = 500
iterations = 500
a = np.random.randn(array_size, array_size)
timeit_iterations = 10
t1 = timeit.timeit(lambda: array_sum(a, array_size, iterations),
number=timeit_iterations)
print("run time: {:.2f} ms".format(1e3*t1/timeit_iterations))
On my machine, MATLAB completes in 58 ms. The Python version runs in 292 ms, or 5X slower.
I also tried speeding up the Python code by adding the Numba JIT decorator @jit('f8[:,:](i8, i8)', nopython=True)
, but the time only dropped to 236 ms (4X slower).
This is slower than I expected. Am I using timeit improperly? Is there something wrong with my Python code?
EDIT: edited so that the random matrix is created outside of benchmarked function.
EDIT 2: I ran the benchmark using Torch instead of Numpy (calculating the sum as s = torch.add(s, 0.5, a)
) and it runs in just 52 ms on my computer!
From my experience, when using numba's jit function it's usually faster to expand array operations into loops. So I tried to rewrite your python function as:
@jit(nopython=True, cache=True)
def array_sum_numba(a, array_size, iterations):
s = np.zeros((array_size, array_size))
for m in range(iterations):
for i in range(array_size):
for j in range(array_size):
s[i,j] = a[i,j] + 0.5 * s[i,j]
return s
And out of curiosity, I've also tested @percusse's version with a little modification on the parameter:
def array_sum2(r, array_size, iterations):
s = np.zeros((array_size, array_size))
for m in range(iterations):
s /= 2
s += r
return s
The testing results on my machine are:
This result is within my expectation. It's worthing mentioning that I've increased timeit iterations to 50, which results in some significant time reduction for numba version.
In summary: The Python code can still be significantly accelerated if you use numba's jit and write the function in loops. I don't have Matlab on my machine to test, but my guess is with numba the python version is faster.
Since you are updating the same variable suitable for inplace operations, you can update your function as
def array_sum2(array_size, iterations):
s = np.zeros((array_size, array_size))
r = np.random.randn(array_size, array_size)
for m in range(iterations):
s /= 2
s += r
return s
This has given the following speed benefit on my machine compared to array_sum
run time: 157.32 ms
run time2: 672.43 ms
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With