I'm performing a convolution along the Z vector of a 3D numpy array, then other operations on the results, but my current implementation is slow. Is the for loop what is slowing me down here, or is it the convolution? I tried reshaping to a 1D vector and performing the convolution in one pass (as I did in Matlab), without the for loop, but it doesn't improve performance. My Matlab version is about 50% faster than anything I can come up with in Python. Relevant section of code:
convolved = np.zeros((y_lines, x_lines, z_depth))
for i in range(0, y_lines):
    for j in range(0, x_lines):
        convolved[i,j,:] = fftconvolve(data[i,j,:], Gauss)        # 80% of time here
        result[i,j,:] = other_calculations(convolved[i,j,:])      # 20% of time here
Is there a better way to do this than a for loop? I've heard of Cython, but I have limited experience in Python so far, so I'd aim for the simplest solution.
The fftconvolve function you are using is presumably from SciPy. If so, be aware that it takes N-dimensional arrays. So a faster way to do your convolution would be to generate the 3d kernel that corresponds to doing nothing in the x and y dimensions and doing a 1d gaussian convolution in z.
Some code and timing results are below. On my machine and with some toy data, this led to a 10× speedup, as you can see:
import numpy as np
from scipy.signal import fftconvolve
from scipy.ndimage import gaussian_filter

# use scipy's gaussian_filter to isolate a normalized 1d gaussian kernel
kernel_base = np.ones(shape=(5))
kernel_1d = gaussian_filter(kernel_base, sigma=1, mode='constant')
kernel_1d = kernel_1d / np.sum(kernel_1d)
# make the 3d kernel that does gaussian convolution in z axis only
kernel_3d = np.zeros(shape=(1, 1, 5,))
kernel_3d[0, 0, :] = kernel_1d
# generate random data
data = np.random.random(size=(50, 50, 50))
# define a function for loop based convolution for easy timeit invocation
def convolve_with_loops(data):
    nx, ny, nz = data.shape
    convolved = np.zeros((nx, ny, nz))
    for i in range(0, nx):
        for j in range(0, ny):
            convolved[i,j,:] = fftconvolve(data[i, j, :], kernel_1d, mode='same')
    return convolved
# compute the convolution two different ways: with loops (first) or as a single 3d convolution (second)
convolved = convolve_with_loops(data)
convolved_2 = fftconvolve(data, kernel_3d, mode='same')
# raise an error unless the two computations return equivalent results
assert np.all(np.isclose(convolved, convolved_2))
# time the two routes of the computation
%timeit convolved = convolve_with_loops(data)
%timeit convolved_2 = fftconvolve(data, kernel_3d, mode='same')
timeit results:

convolve_with_loops:       10 loops, best of 3: 198 ms per loop
fftconvolve (3d kernel):   100 loops, best of 3: 18.1 ms per loop
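A footnote beyond the original answer: because the kernel here is a gaussian, another route worth timing is scipy.ndimage.gaussian_filter1d, which smooths along a single chosen axis of the whole array in one call, with no explicit kernel to build. A minimal sketch, assuming the same toy data and sigma=1 (gaussian_filter1d picks its own kernel length via its truncate parameter, so the numbers will not match the hand-built 5-tap kernel exactly):

import numpy as np
from scipy.ndimage import gaussian_filter1d

data = np.random.random(size=(50, 50, 50))

# smooth along the last (z) axis of the full 3d array in one vectorized call;
# mode='constant' zero-pads at the edges, roughly mirroring the zero-padding
# behaviour of fftconvolve(..., mode='same'); results differ slightly at the
# boundaries and because the kernel length is set by truncate, not fixed at 5
smoothed_z = gaussian_filter1d(data, sigma=1, axis=-1, mode='constant')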