I am trying to implement a vision algorithm, which includes a prefiltering stage with a 9x9 Laplacian-of-Gaussian filter. Can you point to a document which explains fast filter implementations briefly? I think I should make use of FFT for most efficient filtering.
FFT is the fastest technique known for convolving signals, and FFTW is the fastest free library available for computing the FFT.
The 2D convolution is a fairly simple operation at heart: you start with a kernel, which is simply a small matrix of weights. This kernel “slides” over the 2D input data, performing an elementwise multiplication with the part of the input it is currently on, and then summing up the results into a single output pixel.
C++ Algorithm for Convolution 2D We need 4 nested loops for 2D convolution instead of 2 loops in 1D convolution. The above snippet code is simple and easiest way to understand how convolution works in 2D.
Given two arrays A[] and B[] consisting of N and M integers respectively, the task is to construct a convolution array C[] of size (N + M – 1). The convolution of 2 arrays is defined as C[i + j] = ∑(a[i] * b[j]) for every i and j.
Are you sure you want to use FFT? That will be a whole-array transform, which will be expensive. If you've already decided on a 9x9 convolution filter, you don't need any FFT.
Generally, the cheapest way to do convolution in C is to set up a loop that moves a pointer over the array, summing the convolved values at each point and writing the data to a new array. This loop can then be parallelised using your favourite method (compiler vectorisation, MPI libraries, OpenMP, etc).
Regarding the boundaries:
The 4 points are because the maximum boundary overlap of a 9x9 kernel is 4 points outside the main grid. Thus, n points of border needed for a 2n+1 x 2n+1 kernel.
If you need this convolution to be really fast, and/or your grid is large, consider partitioning it into smaller pieces that can be held in the processor's cache, and thus calculated far more quickly. This also goes for any GPU-offloading you might want to do (they are ideal for this type of floating-point calculation).
Here is a theory link http://hebb.mit.edu/courses/9.29/2002/readings/c13-1.pdf
And here is a link to fftw, which is a pretty good FFT library that I've used in the past (check licenses to make sure it is suitable) http://www.fftw.org/
All you do is FFT your image and kernel (the 9x9 matrix). Multiply together, then back transform.
However, with a 9x9 matrix you may still be better doing it in real coordinates (just with a double loop over the image pixels and the matrix). Try both ways!
Actually you don't need to use a FFT size large enough to hold the entire image. You can do a lot of smaller overlapping 2d ffts. You can search for "fast convolution" "overlap save" "overlap add".
However, for a 9x9 kernel. You may not see much advantage speedwise.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With