I am trying to replicate the outcome of this link using linear convolution in spatial-domain.
Images are first converted to 2d double
arrays and then convolved. Image and kernel are of the same size. The image is padded before convolution and cropped accordingly after the convolution.
As compared to the FFT-based convolution, the output is weird and incorrect.
How can I solve the issue?
Note that I obtained the following image output from Matlab which matches my C# FFT output:
.
Update-1: Following @Ben Voigt's comment, I changed the Rescale()
function to replace 255.0
with 1
and thus the output is improved substantially. But, still, the output doesn't match the FFT output (which is the correct one).
.
Update-2: Following @Cris Luengo's comment, I have padded the image by stitching and then performed spatial convolution. The outcome has been as follows:
So, the output is worse than the previous one. But, this has a similarity with the 2nd output of the linked answer which means a circular convolution is not the solution.
.
Update-3: I have used the Sum()
function proposed by @Cris Luengo's answer. The result is a more improved version of **Update-1**
:
But, it is still not 100% similar to the FFT version.
.
Update-4: Following @Cris Luengo's comment, I have subtracted the two outcomes to see the difference:
,
1. spatial minus frequency domain
2. frequency minus spatial domain
Looks like, the difference is substantial which means, spatial convolution is not being done correctly.
.
Source Code:
(Notify me if you need more source code to see.)
public static double[,] LinearConvolutionSpatial(double[,] image, double[,] mask)
{
int maskWidth = mask.GetLength(0);
int maskHeight = mask.GetLength(1);
double[,] paddedImage = ImagePadder.Pad(image, maskWidth);
double[,] conv = Convolution.ConvolutionSpatial(paddedImage, mask);
int cropSize = (maskWidth/2);
double[,] cropped = ImageCropper.Crop(conv, cropSize);
return conv;
}
static double[,] ConvolutionSpatial(double[,] paddedImage1, double[,] mask1)
{
int imageWidth = paddedImage1.GetLength(0);
int imageHeight = paddedImage1.GetLength(1);
int maskWidth = mask1.GetLength(0);
int maskHeight = mask1.GetLength(1);
int convWidth = imageWidth - ((maskWidth / 2) * 2);
int convHeight = imageHeight - ((maskHeight / 2) * 2);
double[,] convolve = new double[convWidth, convHeight];
for (int y = 0; y < convHeight; y++)
{
for (int x = 0; x < convWidth; x++)
{
int startX = x;
int startY = y;
convolve[x, y] = Sum(paddedImage1, mask1, startX, startY);
}
}
Rescale(convolve);
return convolve;
}
static double Sum(double[,] paddedImage1, double[,] mask1, int startX, int startY)
{
double sum = 0;
int maskWidth = mask1.GetLength(0);
int maskHeight = mask1.GetLength(1);
for (int y = startY; y < (startY + maskHeight); y++)
{
for (int x = startX; x < (startX + maskWidth); x++)
{
double img = paddedImage1[x, y];
double msk = mask1[x - startX, y - startY];
sum = sum + (img * msk);
}
}
return sum;
}
static void Rescale(double[,] convolve)
{
int imageWidth = convolve.GetLength(0);
int imageHeight = convolve.GetLength(1);
double maxAmp = 0.0;
for (int j = 0; j < imageHeight; j++)
{
for (int i = 0; i < imageWidth; i++)
{
maxAmp = Math.Max(maxAmp, convolve[i, j]);
}
}
double scale = 1.0 / maxAmp;
for (int j = 0; j < imageHeight; j++)
{
for (int i = 0; i < imageWidth; i++)
{
double d = convolve[i, j] * scale;
convolve[i, j] = d;
}
}
}
public static Bitmap ConvolveInFrequencyDomain(Bitmap image1, Bitmap kernel1)
{
Bitmap outcome = null;
Bitmap image = (Bitmap)image1.Clone();
Bitmap kernel = (Bitmap)kernel1.Clone();
//linear convolution: sum.
//circular convolution: max
uint paddedWidth = Tools.ToNextPow2((uint)(image.Width + kernel.Width));
uint paddedHeight = Tools.ToNextPow2((uint)(image.Height + kernel.Height));
Bitmap paddedImage = ImagePadder.Pad(image, (int)paddedWidth, (int)paddedHeight);
Bitmap paddedKernel = ImagePadder.Pad(kernel, (int)paddedWidth, (int)paddedHeight);
Complex[,] cpxImage = ImageDataConverter.ToComplex(paddedImage);
Complex[,] cpxKernel = ImageDataConverter.ToComplex(paddedKernel);
// call the complex function
Complex[,] convolve = Convolve(cpxImage, cpxKernel);
outcome = ImageDataConverter.ToBitmap(convolve);
outcome = ImageCropper.Crop(outcome, (kernel.Width/2)+1);
return outcome;
}
1. A term used to identify the linear combination of a series of discrete 2D data (a digital image) with a few coefficients or weights. In the Fourier theory, a convolution in space is equivalent to (spatial) frequency filtering.
Spatial domain — enhancement of the image space that divides an image into uniform pixels according to the spatial coordinates with a particular resolution. The spatial domain methods perform operations on pixels directly. Frequency domain — enhancement obtained by applying the Fourier Transform to the spatial domain.
Convolution is a general purpose filter effect for images. □ Is a matrix applied to an image and a mathematical operation. comprised of integers. □ It works by determining the value of a central pixel by adding the. weighted values of all its neighbors together.
Image enhancement has two methods: spatial domain method and frequency domain method. Spatial domain method is mainly used to directly compute gray levels of pixels in spatial domain, such as the change of gray levels, histogram correction, etc.
Your current output looks more like the auto-correlation function than the convolution of Lena with herself. I think the issue might be in your Sum
function.
If you look at the definition of the convolution sum, you'll see that the kernel (or the image, doesn't matter) is mirrored:
sum_m( f[n-m] g[m] )
For the one function, m
appears with a plus sign, and for the other it appears with a minus sign.
You'll need to modify your Sum
function to read the mask1
image in the right order:
static double Sum(double[,] paddedImage1, double[,] mask1, int startX, int startY)
{
double sum = 0;
int maskWidth = mask1.GetLength(0);
int maskHeight = mask1.GetLength(1);
for (int y = startY; y < (startY + maskHeight); y++)
{
for (int x = startX; x < (startX + maskWidth); x++)
{
double img = paddedImage1[x, y];
double msk = mask1[maskWidth - x + startX - 1, maskHeight - y + startY - 1];
sum = sum + (img * msk);
}
}
return sum;
}
The other option is to pass a mirrored version of mask1
to this function.
I have found the solution from this link. The main clue was to introduce an offset
and a factor
.
.
@Cris Luengo's answer also raised a valid point.
.
The following source code is supplied in the given link:
private void SafeImageConvolution(Bitmap image, ConvMatrix fmat)
{
//Avoid division by 0
if (fmat.Factor == 0)
return;
Bitmap srcImage = (Bitmap)image.Clone();
int x, y, filterx, filtery;
int s = fmat.Size / 2;
int r, g, b;
Color tempPix;
for (y = s; y < srcImage.Height - s; y++)
{
for (x = s; x < srcImage.Width - s; x++)
{
r = g = b = 0;
// Convolution
for (filtery = 0; filtery < fmat.Size; filtery++)
{
for (filterx = 0; filterx < fmat.Size; filterx++)
{
tempPix = srcImage.GetPixel(x + filterx - s, y + filtery - s);
r += fmat.Matrix[filtery, filterx] * tempPix.R;
g += fmat.Matrix[filtery, filterx] * tempPix.G;
b += fmat.Matrix[filtery, filterx] * tempPix.B;
}
}
r = Math.Min(Math.Max((r / fmat.Factor) + fmat.Offset, 0), 255);
g = Math.Min(Math.Max((g / fmat.Factor) + fmat.Offset, 0), 255);
b = Math.Min(Math.Max((b / fmat.Factor) + fmat.Offset, 0), 255);
image.SetPixel(x, y, Color.FromArgb(r, g, b));
}
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With