Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding the derivative of a 2D function using FFT properties

I'm trying to use the FFT properties to get the i-th derivative of a 2D function - specifically a 2D Gaussian function. I would also like to do this numerically in MATLAB.

Actually, I don't have a clue of what I'm doing, but I've read a lot on the Internet and all of MATLAB help, and nothing seems to help me, so I'm going to ask here.

Here's what I have so far:

h = fspecial('gaussian', size(imr), 10);
imshow(h, []);
G = fft2(h);

This is the 0th derivative of the Gaussian of the size of imr, or 512 x 512. I know I have to do the inverse and multiply something, but I don't know what.

Can anybody can give me a clue about getting the i-th derivative of a 2D Gaussian function using FFT properties?

Thanks in advance

like image 679
Keydash Avatar asked Mar 22 '15 01:03

Keydash


People also ask

What does an FFT plot tell you?

The output of the FFT is a complex vector containing information about the frequency content of the signal. The magnitude tells you the strength of the frequency components relative to other components. The phase tells you how all the frequency components align in time.

What is the derivative of frequency?

The derivative of a sine wave of frequency f is a phase-shifted sine wave, or cosine wave, of the same frequency and with an amplitude that is proportional to f, as can be demonstrated in Wolfram Alpha.

How do you use FFT?

Y = fft( X ) computes the discrete Fourier transform (DFT) of X using a fast Fourier transform (FFT) algorithm. If X is a vector, then fft(X) returns the Fourier transform of the vector. If X is a matrix, then fft(X) treats the columns of X as vectors and returns the Fourier transform of each column.

How does FFT function work?

The FFT operates by decomposing an N point time domain signal into N time domain signals each composed of a single point. The second step is to calculate the N frequency spectra corresponding to these N time domain signals. Lastly, the N spectra are synthesized into a single frequency spectrum.


1 Answers

Disclaimer - Edit on 16th September, 2015

This post has significantly changed due to a small yet fundamental flaw in my understanding of the derivative operation in the frequency domain. A lot of this post has changed since the previous iteration.

I would like to thank Luis Mendo for helping me debug why this wasn't working for some other images other than this one that was presented in the original version of the post.


When it comes to discrete-time data, there is no such thing as the derivative. The derivative is an analytical tool that only exists in continuous-time. In discrete-time, we can only approximate by using differences. As such, I'm going to assume that you mean the difference operation rather than the derivative. One of the most common approximations to the derivative is to use the forward difference operation. Concretely, for a 1D discrete sequence x[n], the output sequence y[n] where the forward difference operation is applied is defined as:

y[n] = x[n+1] - x[n], for n = 1, 2, ..., M-1

M is the total number of samples in your discrete sequence. Naturally, there will be one less value due to the nature of the forward difference operation.

To accomplish the same thing in the frequency domain, you have to use the shift theorem from the properties of the Discrete Fourier Transform. Concretely, given the DFTed version of the signal x[n], which is X(k), the following property relates the time-domain and frequency-domain together:

Here, i represents the complex number, or the square root of -1, and k is the angular frequency in the frequency domain. F^{-1} is the inverse Fourier Transform of the signal X(k), which is the Fourier Transform of the time domain signal x[n]. M is also the length of the signal x[n] and m is the amount of shift you want to move the signal by. Therefore, this is saying that if you want to compute a shift operation by shifting by m positions, you simply need to take the Fourier Transform, element-by-element multiply each component by exp(-i*2*pi*m*k/M) where k is the index to a point in the Fourier Transform and take the inverse Fourier Transform of this intermediate result.

Take special note that the last element of the Fourier Transform is meaningless, because the difference operation results in a signal that has one less element and so it's important to remove the last element from this result.

With the above noted, to compute the approximation to the derivative we need to find the Fourier Transform of y[n] or Y(k), and can be computed like so:

Take note that the shift is -1 such that x[n+1] = x[n-(-1)]. Therefore, you would compute the Fourier Transform, multiply each corresponding value by exp(i*2*pi*k/M) - 1 and take the inverse.

It isn't so much of a stretch to go into 2D. Here I will assume we're doing the difference operation in both directions - horizontal and vertical. Remember, we have two variables that represent the horizontal and vertical spatial coordinates. We would also call this the spatial domain, rather than the time domain, because the variables are with accordance to our position in a 2D space. Going with the above formulation, going into 2D is very simply:

Here, u and v represent the spatial coordinates of the 2D discrete difference operation y[u,v]. U and V are the 2D frequency components, and M and N are the number of columns and rows of the 2D signal. Take special care that you are in fact doing a horizontal difference operation followed by a vertical difference operation. Specifically, you would do a horizontal difference operation for each row independently, followed by a vertical difference operation for each column separately. In fact, the order doesn't matter. You can do one followed by the other in whatever order you choose. What's important to note is that the 2D Fourier Transform of the signal stays the same and you're just multiplying each value by some complex constant. However, you need to ensure that you remove the last row and last column of the result because each operation results in a signal that is one point less than the original length of each row and each column due to the difference operation observation that we talked about.

Assuming that you already have the FFT of the function, you just have to take each 2D spatial coordinate and multiply by (exp(i*2*pi*U/M) - 1)*(exp(i*2*pi*V/N) - 1). For each 2D component of the FFT stored at (U,V), we use these same frequency coordinates and multiply that location by the product of those two things earlier. After, you would take the inverse FFT and we would compare this with the actual spatial-domain derivative. We should see that they're the same.

Take note that when you perform the 2D FFT, the frequencies are normalized such that they span between [-1,1] in both dimensions. We need to take this into account when showing the equivalency of the derivative operations in both domains. Also, to make things much more simpler, it would also be prudent if you shift the frequency components so that they appear in the centre of the image. When performing the 2D FFT, the components are positioned such that the DC component, or the origin is located at the top-left corner of the image. Let's shift everything to the centre by use of fftshift.

Now, let's start with something small. For the sake of this procedure, we assume that the size of each dimension is odd to allow for the shifting of the frequencies via fftshift to be easier and unambiguous. Let's say we had a 101 x 101 Gaussian filter, with a standard deviation of 10, and we take the FFT of this, then fftshift. So:

h = fspecial('gaussian', [101 101], 10);
hf = fftshift(fft2(h));

If we want to find the derivative, we define a grid of frequency points that span the size of the image in frequency domain, starting with 0 as the origin, which is now the centre. We can do this via meshgrid, and so:

[U,V] = meshgrid(-50:50, -50:50);

In general, this horizontal and vertical coordinates span from [-floor(cols/2),floor(cols/2)] for the horizontal and [-floor(rows/2),floor(rows/2)] for the vertical. Note that the above grid has the same dimensions as the image, but the centre is at (0,0), and we span from -50 to 50 in both dimensions.

Now, do the difference operation in frequency domain in both directions. Let's do this for both x and y, so the first derivative in both directions:

hf2 = (exp(1i*2*pi*U/101) - 1).*(exp(1i*2*pi*V/101) - 1).*hf;

Take note that we had to normalize the frequencies by dividing by 101 due to the fact that the frequencies of the FFT are normalized. 101 x 101 is due to the size of the image. If we want this back in time domain, simply take the inverse FFT. However, we need to undo the shift that we did with ifftshift before taking the inverse FFT. We also use real to eliminate any of the residual complex valued components due to numerical inaccuracies. You'll see that the output is complex valued, but the imaginary components are so small in magnitude that we can disregard them. As such:

out_freq = real(ifft2(ifftshift(hf2)));

Remember to also remove the last row and column:

out_freq = out_freq(1:end-1,1:end-1);

If you want to compare this output with the time-domain difference operation of the filter h, we can do this exactly by using diff over the rows, then using this result, we go over the columns. As such:

out_spatial = diff(diff(h, 1, 1), 1, 2);

The second parameter denotes what order the difference operation is. This is a first-order difference, so both the horizontal and vertical difference operations are set to 1. The third parameter tells you what dimension you're doing the differencing on. We're doing this over the columns first, then applying the intermediate result over the rows. We can then show what they both look like:

figure;
subplot(1,2,1);
imshow(out_spatial, []);
title('Derivative from spatial domain');
subplot(1,2,2);
imshow(out_freq, []);
title('Derivative from frequency domain');

And we get:

enter image description here

Cool! This agrees with what I know about the derivative of a Gaussian.

Bonus

As a little bit of a bonus, I'd like to point you to some great slides from Cornell University: http://www.cs.cornell.edu/courses/cs6670/2011sp/lectures/lec02_filter.pdf

Specifically, this is what the Gaussian looks like if you were to look at it in 3D, where we have both horizontal and vertical coordinates, and the Z value is the height of the Gaussian in 3D:

enter image description here

So, this is what happens if we took the derivative of both directions independently. The top shows you what happens spatially, and the bottom shows you what happens if we took a look at this directly above on a bird's eye view:

enter image description here

The most negative part of the surface is drawn in black, and the most positive part of the surface is drawn in white, and everything else is scaled in between in terms of intensity... so, if we took the spatial derivatives of one direction, then used this intermediate result and took the spatial derivative of the other direction, you'll see that we would get what we saw above. Play around with different m and n values and see what you get. This property I haven't seen used that often in practice, but it's certainly good to know.

like image 113
rayryeng Avatar answered Sep 27 '22 23:09

rayryeng