
Rescaling Complex data after FFT Convolution

I have tested two rescaling functions by applying them on FFT convolution outputs.

The first one is collected from this link.

    public static void RescaleComplex(Complex[,] convolve)
    {
        int imageWidth = convolve.GetLength(0);
        int imageHeight = convolve.GetLength(1);

        double maxAmp = 0.0;
        for (int i = 0; i < imageWidth; i++)
        {
            for (int j = 0; j < imageHeight; j++)
            {
                maxAmp = Math.Max(maxAmp, convolve[i, j].Magnitude);
            }
        }
        double scale = 1.0 / maxAmp;
        for (int i = 0; i < imageWidth; i++)
        {
            for (int j = 0; j < imageHeight; j++)
            {
                convolve[i, j] = new Complex(convolve[i, j].Real * scale,
                    convolve[i, j].Imaginary * scale);
            }
        }
    }

[Figure 1: output of the first rescaling function]

Here the problem is incorrect contrast.

The second one is collected from this link.

    public static void RescaleComplex(Complex[,] convolve)
    {            
        int imageWidth = convolve.GetLength(0);
        int imageHeight = convolve.GetLength(1);

        double scale = imageWidth * imageHeight;

        for (int j = 0; j < imageHeight; j++)
        {
            for (int i = 0; i < imageWidth; i++)
            {
                double re = Math.Max(0.0, Math.Min(convolve[i, j].Real * scale, 1.0));
                double im = Math.Max(0.0, Math.Min(convolve[i, j].Imaginary * scale, 1.0));
                convolve[i, j] = new Complex(re, im);
            }
        }
    }

[Figure 2: output of the second rescaling function]

Here the output is totally white.

So, as you can see, one version gives a correct output and the other an incorrect one.

How can I solve this dilemma?


Note: the matrix kernel is the following:

 0  -1   0 
-1   5  -1 
 0  -1   0

Source Code. Here is my FFT Convolution function.

    private static Complex[,] ConvolutionFft(Complex[,] image, Complex[,] kernel)
    {
        Complex[,] imageCopy = (Complex[,])image.Clone();
        Complex[,] kernelCopy = (Complex[,])kernel.Clone();
        Complex[,] convolve = null;

        int imageWidth = imageCopy.GetLength(0);
        int imageHeight = imageCopy.GetLength(1);

        int kernelWidth = kernelCopy.GetLength(0);
        int kernelHeight = kernelCopy.GetLength(1);

        if (imageWidth == kernelWidth && imageHeight == kernelHeight)
        {
            Complex[,] fftConvolved = new Complex[imageWidth, imageHeight];

            Complex[,] fftImage = FourierTransform.ForwardFFT(imageCopy);
            Complex[,] fftKernel = FourierTransform.ForwardFFT(kernelCopy);                

            for (int j = 0; j < imageHeight; j++)
            {
                for (int i = 0; i < imageWidth; i++)
                {
                    fftConvolved[i, j] = fftImage[i, j] * fftKernel[i, j];
                }
            }

            convolve = FourierTransform.InverseFFT(fftConvolved);

            RescaleComplex(convolve);

            convolve = FourierShifter.ShiftFft(convolve);
        }
        else
        {
            throw new ArgumentException("Padded image and kernel dimensions must be the same.");
        }

        return convolve;
    }
asked Jul 24 '18 by user366312
1 Answer

This is not really a dilemma. This is just an issue of the limited range of the display, and of your expectations, which are different in the two cases.

  1. (top): this is a normalized kernel (its elements sum up to 1). It doesn't change the contrast of the image. But because of negative values in it, it can generate values outside the original range.

  2. (bottom): this is not a normalized kernel. It changes the contrast of the output.

For example, play around with the kernel

 0, -1,  0
-1,  6, -1
 0, -1,  0

(notice the 6 in the middle). It sums up to 2. The image contrast will be doubled. That is, in a region where the input is all 0, the output is 0 as well, but where the input is all 1, the output will be 2 instead.
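This contrast doubling is easy to verify with direct spatial-domain arithmetic, no FFT needed. Below is a minimal sketch in Python rather than C#; `apply_at_center` is a hypothetical helper that evaluates the kernel at a single pixel of a flat patch:

```python
def apply_at_center(patch, kernel):
    """Direct correlation of a 3x3 kernel with a 3x3 patch (one output pixel).

    For this symmetric kernel, correlation and convolution coincide.
    """
    return sum(patch[i][j] * kernel[i][j] for i in range(3) for j in range(3))

kernel = [[0, -1, 0],
          [-1, 6, -1],
          [0, -1, 0]]  # elements sum to 2

flat_black = [[0.0] * 3 for _ in range(3)]  # a region where the input is all 0
flat_white = [[1.0] * 3 for _ in range(3)]  # a region where the input is all 1

print(apply_at_center(flat_black, kernel))  # 0.0
print(apply_at_center(flat_white, kernel))  # 2.0 -> contrast doubled
```

With the kernel from the question (5 in the middle, sum 1) the same computation returns 1.0 for the all-ones patch, i.e. the contrast is unchanged.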

Typically, a convolution filter that is not meant to change image contrast is normalized. If you apply such a filter, you don't need to re-scale the output for display (though you might want to clip out-of-range values if they appear). However, it is possible that the out-of-range values are relevant; in this case you need to re-scale the output to match the display range.

In your case 2 (the image kernel), you could normalize the kernel to avoid re-scaling the output. But this is not a solution in general. Some filters sum up to 0 (e.g. the Sobel kernels or the Laplace kernel, both of which are based on derivatives, which remove the DC component). These cannot be normalized; you will always have to re-scale the output image for display (though you wouldn't re-scale their output for analysis, since their output values have a physical meaning that is destroyed by re-scaling).
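To illustrate the zero-sum case, here is a minimal Python sketch (hypothetical helper names, not the asker's C# code): every flat region maps to zero regardless of its brightness, so the output contains negative values and has to be mapped onto the display range, e.g. with a linear min-max rescale:

```python
laplace = [[0, 1, 0],
           [1, -4, 1],
           [0, 1, 0]]  # elements sum to 0: the DC component is removed

def apply_at_center(patch, kernel):
    """Direct correlation of a 3x3 kernel at the centre of a 3x3 patch."""
    return sum(patch[i][j] * kernel[i][j] for i in range(3) for j in range(3))

dark = [[0.2] * 3 for _ in range(3)]
light = [[0.9] * 3 for _ in range(3)]
# both flat patches map to (numerically) zero, regardless of brightness
print(apply_at_center(dark, laplace), apply_at_center(light, laplace))

def rescale_for_display(values):
    """Map the actual output range linearly onto [0, 1] for display."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(rescale_for_display([-0.3, 0.0, 0.5]))  # approximately [0.0, 0.375, 1.0]
```

Note that this min-max rescale is exactly what "fitting the full range within the display's range" means, and it is what reduces the apparent contrast when out-of-range values are present.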

That is to say, the convolution sometimes is meant to produce an output image with the same contrast (within approximately the same range) as the input image, and sometimes it isn't. You need to know what filter you are applying for the output to make sense, and to be able to display the output on a screen that expects images to be in a specific range.


EDIT: explanation of what is going on in your figures.

1st figure: Here you are rescaling so that the full intensity range of the output is visible, so logically you don't get any saturated pixels. But because the matrix kernel enhances high frequencies, the output image has values outside the original range. Rescaling to fit that full range within the display's range reduces the contrast of the image.

2nd figure: You are rescaling the frequency-domain convolution result by N = imageWidth * imageHeight. This yields the right output. That you need to apply this scaling indicates that your forward FFT scales by 1/N, and your inverse FFT doesn't scale.

For IFFT(FFT(img))==img, it is necessary that either the FFT or the IFFT is scaled by 1/N. Typically it is the IFFT that is scaled. The reason is that then convolution behaves as expected without any further scaling. To see this, imagine an image where all pixels have the same value. FFT(img) will be zero everywhere except for the 0 frequency component (the DC component), which will be sum(img). The normalized kernel sums up to 1, so its DC component is sum(kernel)==1. Multiplying these two, we again obtain a frequency spectrum like the input's, with a DC component of sum(img). Its inverse transform will be equal to img. This is exactly what we expect for this convolution.

Now, use the other form of normalization (i.e. the one used by the FFT library you have access to). The DC component of FFT(img) will be sum(img)/N. The DC component of the kernel will be 1/N. Multiplying these two, we obtain a DC component of sum(img)/(N*N). Its inverse transform will be equal to img/N. Thus, you need to multiply by N to obtain the expected result. This is exactly what you're seeing in your frequency-domain convolution with the "matrix kernel", which is normalized.
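This 1/N bookkeeping can be checked with a toy 1-D DFT in plain Python (hypothetical function names; this mirrors the forward-scaled convention described above, not the asker's FourierTransform class):

```python
import cmath

def dft(x, scale):
    """Naive 1-D DFT with a configurable scale factor applied to the result."""
    N = len(x)
    return [scale * sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                        for n in range(N))
            for k in range(N)]

N = 4
img = [1.0] * N                  # constant image: expected convolution output is img itself
kernel = [1.0, 0.0, 0.0, 0.0]    # identity kernel, sums to 1 (normalized)

# convention in question: forward scaled by 1/N, inverse unscaled
F_img = dft(img, 1.0 / N)
F_ker = dft(kernel, 1.0 / N)
product = [a * b for a, b in zip(F_img, F_ker)]
result = [sum(product[k] * cmath.exp(2j * cmath.pi * k * n / N)
              for k in range(N)).real
          for n in range(N)]

print(result)                   # img / N, i.e. 0.25 everywhere
print([v * N for v in result])  # multiplying by N recovers img
```

With the usual convention (forward unscaled, inverse scaled by 1/N) the extra factor disappears and no post-multiplication is needed.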

As I mentioned above, the "image kernel" isn't normalized. The DC component of FFT(kernel) is sum(kernel)/N, the multiplication of that by FFT(img) has a DC component of sum(img)*sum(kernel)/(N*N), and so the inverse transform has its contrast multiplied by sum(kernel)/N; multiplying by N still leaves you with a factor of sum(kernel) too large. If you were to normalize the kernel, you would be dividing it by sum(kernel), which would bring your output into the expected range.
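Normalizing a kernel just means dividing it by the sum of its elements. A minimal Python sketch (hypothetical helper name), using the contrast-doubling variant from above:

```python
def normalize(kernel):
    """Divide a kernel by the sum of its elements so that it sums to 1."""
    s = sum(sum(row) for row in kernel)
    if s == 0:
        raise ValueError("zero-sum kernels (e.g. Laplace) cannot be normalized")
    return [[v / s for v in row] for row in kernel]

k = [[0, -1, 0],
     [-1, 6, -1],
     [0, -1, 0]]  # sums to 2, so it doubles the contrast
nk = normalize(k)
print(sum(sum(row) for row in nk))  # 1.0
print(nk[1][1])                     # 3.0
```

The guard for a zero sum matters: as noted above, derivative-based kernels like Sobel or Laplace sum to 0 and cannot be normalized this way.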

answered Oct 12 '22 by Cris Luengo