So I am aware that a convolution by FFT has a lower computational complexity than a convolution in real space. But what are the downsides of an FFT convolution?
Does the kernel size always have to match the image size, or are there functions that take care of this, for example in pythons numpy and scipy packages? And what about anti-aliasing effects?
For filter kernels longer than about 64 points, FFT convolution is faster than standard convolution, while producing exactly the same result.
This is very easy, because N-dimensional FFTs are already implemented in PyTorch. We simply use the built-in function, and compute the FFT along the last dimension of each Tensor.
FFT convolutions are based on the convolution theorem, which states that given two functions f
and g
, if Fd()
and Fi()
denote the direct and inverse Fourier transform, and *
and .
convolution and multiplication, then:
f*g = Fi(Fd(d).Fd(g))
To apply this to a signal f
and a kernel g
, there are some things you need to take care of:
f
and g
have to be of the same size for the multiplication step to be possible, so you need to zero-pad the kernel (or input, if the kernel is longer than it).If your signal and kernel sizes are f_l
and g_l
, doing a straightforward convolution in time domain requires g_l * (f_l - g_l + 1)
multiplications and (g_l - 1) * (f_l - g_l + 1)
additions.
For the FFT approach, you have to do 3 FFTs of size at least f_l + g_l
, as well as f_l + g_l
multiplications.
For large sizes of both f
and g
, the FFT is clearly superior with its n*log(n)
complexity. For small kernels, the direct approach may be faster.
scipy.signal
has both convolve
and fftconvolve
methods for you to play around. And fftconvolve
handles all the padding described above transparently for you.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With