Implementing a low pass FIR filter, when should one use FFT and IFFT instead of time-domain convolution?
The goal is to achieve the lowest CPU time required for real-time calculations. As I know, FFT has about O(n log n) complexity, but convolution in the time domain is of O(n²) complexity. To implement a low pass filter in the frequency domain, one should use FFT, then multiply each value with filtering coefficients (which are translated into frequency domain), then make IFFT.
So, the question is when it is justified to use frequency-based (FFT+IFFT) filtering instead of using direct convolution based FIR filter? Say, if one have 32 fixed-point coefficients, should FFT+IFFT be used or not? How about 128 coefficients? And so on...
Trying to optimize an existed source code (convolution-based FIR filter), I am totally confused, either I should to use FFT or just optimize it to use SSE or not.
FFT convolution uses the overlap-add method together with the Fast Fourier Transform, allowing signals to be convolved by multiplying their frequency spectra. For filter kernels longer than about 64 points, FFT convolution is faster than standard convolution, while producing exactly the same result.
18.2 FFT Filters Low-pass filters block all frequency components above the cutoff frequency, allowing only the low frequency components to pass. High-pass filters are just the opposite and block frequency components that are below the cutoff frequency.
An ideal low-pass filter can be realized mathematically (theoretically) by multiplying a signal by the rectangular function in the frequency domain or, equivalently, convolution with its impulse response, a sinc function, in the time domain.
The frequency contents of the signal and their powers can be obtained through operations such as the Fast Fourier Transform (FFT). A low-pass filter passes relatively low frequency components in the signal but stops the high frequency components. The so-called cutoff frequency divides the pass band and the stop band.
Convolution is actually O(m*n) where m is the width of the finite impulse response, and N the sample window.
So the tipping point of m where it is useful to change to FFT+IFFT is related to log(N) of the sample window.
In realtime operation the fact that FFT is batch oriented might be more important than the relative amount of clock cycles, as it may not be acceptable to wait 1024 sample points before filtering, if the application is in a regulation loop for example.
Now a lot of development has been done is this area and plenty of code is available, so trying a couple of solutions and benchmarking is key here.
It depends on what architecture you're using and various other factors, but the general "rule of thumb" for 1D DSP is that if the filter size is small, say less than 100 terms, you're probably better off with direct convolution, but for larger filter sizes it may be worth doing fast convolution in the frequency domain.
Of course you need to be certain that the filtering is a performance bottleneck first, as there is no point going to all the extra effort of doing fast convolution if your time domain implementation is fast enough already.
Bottom line: start with simple direct convolution and then switch to fast convolution later if you need it. (You'll need to keep the first implementation for validating the second implementation, so it's not wasted effort, by any means.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With