Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Numerical error building up when computing derivative of function repeatedly with FFT

Tags:

c

rounding

fft

fftw

I have written a C program that uses FFTW to compute derivative (repeatedly) of a function. I am testing for simple sin(x) function. Each step computes the derivative of the answer of the previous step. I am observing that the error builds and after 20 steps the number is pure garbage. Attached is sample output. The answer (at a specific point) should be either 0, +1 or -1, but it is NOT.

---- out ----  data '(0) = 1.000000 -0.000000 
---- out ----  data '(1) = 0.000000 -0.000000 
---- out ----  data '(2) = -1.000000 0.000000 
---- out ----  data '(3) = -0.000000 0.000000 
---- out ----  data '(4) = 1.000000 -0.000000 
---- out ----  data '(5) = 0.000000 -0.000000 
---- out ----  data '(6) = -1.000000 0.000000 
---- out ----  data '(7) = -0.000000 0.000000 
---- out ----  data '(8) = 1.000000 -0.000000 
---- out ----  data '(9) = 0.000000 -0.000000 
---- out ----  data '(10) = -1.000000 0.000000 
---- out ----  data '(11) = -0.000000 0.000000 
---- out ----  data '(12) = 1.000000 -0.000002 
---- out ----  data '(13) = 0.000007 -0.000000 
---- out ----  data '(14) = -1.000000 0.000028 
---- out ----  data '(15) = -0.000113 0.000000 
---- out ----  data '(16) = 0.999997 -0.000444 
---- out ----  data '(17) = 0.001798 -0.000000 
---- out ----  data '(18) = -0.999969 0.007110 
---- out ----  data '(19) = -0.028621 0.000004  

I cannot figure out why error keeps growing. Any suggestion is deeply appreciated. I wrap real function into complex datatype and set the imaginary part to zero. Here is the code:

# include <stdlib.h>
# include <stdio.h>
# include <time.h>
# include <math.h>
# include <complex.h>
# include <fftw3.h>

int main ( int argc, char *argv[] ){
  int i;
  fftw_complex *in;
  fftw_complex *in2;
  fftw_complex *out;

  double pi = 3.14159265359;
  int nx = 8, k, t, tf = 20;
  double xi = 0, xf = 2*pi;
  double dx = (xf - xi)/((double)nx); // Step size

  complex double  *kx;

  fftw_plan plan_backward;
  fftw_plan plan_forward;

  in = fftw_malloc ( sizeof ( fftw_complex ) * nx );
  out = fftw_malloc ( sizeof ( fftw_complex ) * nx );
  in2 = fftw_malloc ( sizeof ( fftw_complex ) * nx );


  kx = malloc ( sizeof ( complex ) * nx );

  // only need it once, hence outside the loop
  for (k = 0; k < nx; k++){
     if (k < nx/2){
            kx[k] = I*2*pi*k/xf;
     } else if (k > nx/2){
           kx[k] = I*2*pi*(k-nx)/xf;
     } else if (k == nx/2){
        kx[k] = 0.0;
     }
  }

  // create plan outside the loop
  plan_forward = fftw_plan_dft_1d ( nx, in, out, FFTW_FORWARD, FFTW_ESTIMATE );
  plan_backward = fftw_plan_dft_1d ( nx, out, in2, FFTW_BACKWARD, FFTW_ESTIMATE );


  // initialize data
  for ( i = 0; i < nx; i++ )
  {
    in[i] = sin(i*dx) + I*0.0;  // note the complex notation.
  }
//-------------------- start time loop ---------------------------------------//

for (t = 0; t < tf; t++){
  // print input data 
  //for ( i = 0; i < nx; i++ ) { printf("initial data '(%f) = %f %f \n", i*dx, creal(in[i]), cimag(in[i]) ); }

  fftw_execute ( plan_forward );

  for ( i = 0; i < nx; i++ )
  {
    out[i] = out[i]*kx[i]; // for first order derivative
  }

  fftw_execute ( plan_backward );
  // normalize
  for ( i = 0; i < nx; i++ )
  {
    in2[i] = in2[i]/nx;
  }
    printf("---- out ----  data '(%d) = %f %f \n", t, creal(in2[0]), cimag(in2[0]) );
  // overwrite input array with this new output and loop over
  for ( i = 0; i < nx; i++ )
  {
    in[i] = in2[i];
  }
  // done with the curent loop.
}
//--------------------- end of loop ----------------------------------------//

  fftw_destroy_plan ( plan_forward );
  fftw_destroy_plan ( plan_backward );

  fftw_free ( in );
  fftw_free ( in2 );
  fftw_free ( out );

  return 0;
}

compiled with gcc source.c -lfftw3 -lm

Update: Here is the output with M_PI looping 25 times. Same error buildup.

---- out ----  data '(0) = 1.000000 0.000000 
---- out ----  data '(1) = -0.000000 -0.000000 
---- out ----  data '(2) = -1.000000 -0.000000 
---- out ----  data '(3) = 0.000000 0.000000 
---- out ----  data '(4) = 1.000000 0.000000 
---- out ----  data '(5) = -0.000000 -0.000000 
---- out ----  data '(6) = -1.000000 -0.000000 
---- out ----  data '(7) = 0.000000 0.000000 
---- out ----  data '(8) = 1.000000 0.000000 
---- out ----  data '(9) = -0.000000 -0.000000 
---- out ----  data '(10) = -1.000000 -0.000000 
---- out ----  data '(11) = 0.000000 0.000000 
---- out ----  data '(12) = 1.000000 0.000000 
---- out ----  data '(13) = -0.000000 -0.000000 
---- out ----  data '(14) = -1.000000 -0.000000 
---- out ----  data '(15) = 0.000000 0.000000 
---- out ----  data '(16) = 1.000000 0.000001 
---- out ----  data '(17) = -0.000002 -0.000000 
---- out ----  data '(18) = -0.999999 -0.000008 
---- out ----  data '(19) = 0.000033 0.000004 
---- out ----  data '(20) = 0.999984 0.000132 
---- out ----  data '(21) = -0.000527 -0.000069 
---- out ----  data '(22) = -0.999735 -0.002104 
---- out ----  data '(23) = 0.008427 0.001099 
---- out ----  data '(24) = 0.995697 0.033667 
like image 417
Amit Avatar asked Jan 25 '18 21:01

Amit


1 Answers

Indeed, refining the value of pi does not solve your issue, even if it improves the accuracy of the 20th derivative. The problem is that small errors are inflated by repeating the derivative filter. To limit this issue, introducing a low-pass filter can be suggested as well as using quad precision. Introducing the concept of condition number helps understanding the way the error is inflated and set the filter accordingly. Nevertheless, computing the 20th derivative is going to remain a nightmare because computing the 20th derivative is ill-conditionned: it is simply not possible, even for the cleanest experimental inputs...

1. Small errors always exist.

The derivative filter, also named ramp filter, inflate the high frequencies more than the small ones. The slightest error on the high frequencies is dramatically inflated by repeating the use of the ramp filter.

Let's look at the small initial errors by printing the frequencies by

printf("---- out ----  data '(%d) %d = %20g %20g \n", t,i, creal(out[i]), cimag(out[i]) );

As pi=3.14159265359 is used, you get:

---- out ----  data '(0) 0 =         -2.06712e-13                    0 
---- out ----  data '(0) 1 =          6.20699e-13                   -4 
---- out ----  data '(0) 2 =         -2.06823e-13          2.92322e-13 
---- out ----  data '(0) 3 =         -2.07053e-13          1.03695e-13 
---- out ----  data '(0) 4 =         -2.06934e-13                    0 
---- out ----  data '(0) 5 =         -2.07053e-13         -1.03695e-13 
---- out ----  data '(0) 6 =         -2.06823e-13         -2.92322e-13 
---- out ----  data '(0) 7 =          6.20699e-13                    4 

Due to the discontinuity induced by the missing digits of pi, there are small non-null values for all the frequencies and these values are inflated by taking the derivative.

As pi=M_PI is used, these initial errors are smaller, but still non-null:

---- out ----  data '(0) 0 =          1.14424e-17                    0 
---- out ----  data '(0) 1 =         -4.36483e-16                   -4 
---- out ----  data '(0) 2 =          1.22465e-16         -1.11022e-16 
---- out ----  data '(0) 3 =          1.91554e-16         -4.44089e-16 
---- out ----  data '(0) 4 =          2.33487e-16                    0 
---- out ----  data '(0) 5 =          1.91554e-16          4.44089e-16 
---- out ----  data '(0) 6 =          1.22465e-16          1.11022e-16 
---- out ----  data '(0) 7 =         -4.36483e-16                    4 

These small errors are inflated just like the previous ones and the problem is not entirely solved. Let's try to zero these frequencies during the first loop:

if(t==0){
     for (k = 0; k < nx; k++){
         if (k==1 || nx-k==1){
            out[k] = I*4.0;
         }else{
            out[k] =0.0;
         }
     }
}

This time the only non-null frequencies during the first loop t=0 are the correct ones. Let's look at the second loop:

---- out ----  data '(1) 0 =                    0                    0 
---- out ----  data '(1) 1 =                   -4                    0 
---- out ----  data '(1) 2 =                    0                    0 
---- out ----  data '(1) 3 =         -4.44089e-16                    0 
---- out ----  data '(1) 4 =                    0                    0 
---- out ----  data '(1) 5 =          4.44089e-16                    0 
---- out ----  data '(1) 6 =                    0                    0 
---- out ----  data '(1) 7 =                    4                    0 

Due to finite precision computations during the DFT backward/forward transform and scaling, small errors appear and are inflated. AGAIN.

2. To limit the growth of the error, filtering can be introduced.

Most experimental inputs are plaged by large high frequency noise, which can be reduced by applying a low-pass filter, such as the Butterworth filter. See https://www.hindawi.com/journals/ijbi/2011/693795/ for details and alternative. This filter is characterized by a cutting frequency kc and an exponent and the frequency response of the ramp filter is modified as follow:

    //parameters of Butterworth Filter:
    double kc=3;
    double n=16;
    // only need it once, hence outside the loop
    for (k = 0; k < nx; k++){
      if (k < nx/2){
        // add low pass filter:
        kx[k] = I*2*pi*k/xf;
        kx[k]*=1./(1.+pow(k/kc,n));
      } else if (k > nx/2){
        kx[k] = I*2*pi*(k-nx)/xf;
        kx[k]*=1./(1.+pow((nx-k)/kc,n));
      } else if (k == nx/2){
        kx[k] = 0.0;
      }
    }

Using these parameters, the error on the 20th derivative is decreased from 5.27e-7 to 1.22e-12.

Another improvement is possible by not comming back to the real space between derivatives. That way, a lot of rounding errors during floating point computations are avoided. In this particular case, zeroing the input frequencies ensures that the error remain null, but it is slightly artificial... From a practical viewpoint, if the input signal is provided in the real space, using a filter to compute the derivatives is almost required.

3. The error is growing due to the condition number of the derivative filter

The derivative is a linear application and it is characterized by a condition number. Let's say that the input is plagued by an error eps on all frequencies. If the first frequency is amplified by a factor alpha, the frequency k is amplified by a factor k*alpha. Hence, each time the derivative is applied, the signal-to-noise ratio is divided by a ratio kc (the largest frequency), named condition number. If the filter is repeated 20 times, the signal-to-noise ratio is divided by kc^20.

Double precision number are about eps=10e-14 precise : this is the best signal-to-noise ratio you can get! Most experimental inputs will be much worse than that. For instance, grey scale image are often sampled using 16bit=65536 grey levels. As a result, a grey scale image is at most eps=1/65536 precise. Similarly, typical audio bit depth is 24, corresponding to eps=6e-8. Quad precision can be advised for nearly analytical inputs, it precision being about esp=1e-34... Let's find the frequency such that kc^20*eps<1:

eps=10e-14    kc=5
eps=1/65536   kc=1
eps=1/2^24    kc=2
esp=1e-34     kc=44

Hence, if the input is double precision, at best, only 4 frequencies of the 20th derivative are going to be significant... All frequencies above the 4th must be filtered by a strong low-pass filter. Hence, using quad precision can be advised: see fftw's documentation to compile fftw for gcc's quad precision type __float128 linking against quadmath. If the input is an image, computing the 20th derivative is simply out of scope: none of the frequencies are ever going to be significant !

like image 153
francis Avatar answered Nov 19 '22 12:11

francis