
The result of my own double-precision cos() implementation in a shader is NaN, but it works well on the CPU. What is going wrong?

I want to implement my own double-precision cos() function in a GLSL compute shader, because the built-in version only supports float.

This is my code:

double faculty[41]; // faculty[i] holds i! (the factorial); values are calculated at the beginning of main()

double myCOS(double x)
{
    double sum,tempExp,sign;
    sum = 1.0;
    tempExp = 1.0;
    sign = -1.0;

    for(int i = 1; i <= 30; i++)
    {
        tempExp *= x;
        if(i % 2 == 0){
            sum = sum + (sign * (tempExp / faculty[i]));
            sign *= -1.0;
        }
    }
    return sum;
}

With this code, sum turns out to be NaN in the shader, but on the CPU the algorithm works well. I also tried to debug the code and found the following:

  • faculty[i] is positive and not zero for all entries
  • tempExp is positive in each step
  • none of the other variables are NaN during each step
  • the first time sum is NaN is at the step with i=4

Now my question: what exactly can go wrong if every variable is a number and nothing is divided by zero, especially given that the algorithm works on the CPU?

DanceIgel asked Mar 05 '15 11:03



1 Answer

Let me guess:

You determined that the problem is in the loop, which uses only the following operations: +, *, /.

The rules for generating NaN from these operations are:

  • The divisions 0/0 and ±∞/±∞
  • The multiplications 0×±∞ and ±∞×0
  • The additions ∞ + (−∞), (−∞) + ∞ and equivalent subtractions
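The rules above can be checked on the CPU with a few lines of C. This is only a sketch: the function name nan_rules_hold is made up for illustration, and it assumes the platform follows IEEE 754 and is not compiled with fast-math style flags.

```c
#include <math.h>

/* Hypothetical helper: returns 1 if every operation listed above
   yields NaN under IEEE 754 arithmetic, 0 otherwise. */
int nan_rules_hold(void)
{
    /* volatile keeps the compiler from folding the expressions away */
    volatile double zero = 0.0;
    volatile double inf = INFINITY;

    double div_zero = zero / zero;   /* 0/0 */
    double div_inf  = inf / inf;     /* inf/inf */
    double mul      = zero * inf;    /* 0 * inf */
    double add      = inf + (-inf);  /* inf + (-inf) */

    return isnan(div_zero) && isnan(div_inf) && isnan(mul) && isnan(add);
}
```

Note that a finite, nonzero divisor never produces NaN on its own: inf divided by a finite number is still inf.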

You ruled out the possibility of 0/0 and ±∞/±∞ by stating that faculty[] is correctly initialized.

The variable sign is always 1.0 or -1.0, so it cannot generate NaN through the * operation.

What remains is the + operation, if tempExp ever becomes ±∞.

So x is probably too large on entry to your function, and tempExp overflows to ±∞, which makes sum ±∞ too. The next time sum is updated, you trigger the NaN-generating operation ∞ + (−∞). This is because one side of the addition is multiplied by sign, and sign switches between positive and negative each time sum is updated.
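To make this guess concrete, here is a C version of the loop from the question that reports the first step at which sum becomes NaN. The function name first_nan_step and the test input 1e200 are my own assumptions, picked only to force an overflow; the question does not say which x was actually passed in.

```c
#include <math.h>

/* Same loop as myCOS in the question, but returns the first i at which
   sum becomes NaN, or 0 if it never does. (Hypothetical helper.) */
int first_nan_step(double x)
{
    double faculty[31];              /* faculty[i] = i!, as in the question */
    faculty[0] = 1.0;
    for (int i = 1; i <= 30; i++)
        faculty[i] = faculty[i - 1] * i;

    double sum = 1.0, tempExp = 1.0, sign = -1.0;
    for (int i = 1; i <= 30; i++) {
        tempExp *= x;                /* becomes inf once |x|^i overflows */
        if (i % 2 == 0) {
            sum = sum + (sign * (tempExp / faculty[i]));
            sign *= -1.0;
            if (isnan(sum))
                return i;
        }
    }
    return 0;
}
```

With x = 1e200, tempExp overflows to +∞ at i=2, so sum becomes −∞ there, and at i=4 the update is (−∞) + ∞, which is NaN. At no point before that is any variable NaN, and tempExp stays positive, which would be consistent with what you observed.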

You're trying to approximate cos(x) around 0.0, so you should use the properties of the cos() function to reduce your input to a value near 0.0, ideally in the range [0, pi/4]. For instance, remove multiples of 2*pi, and get the values of cos() in [pi/4, pi/2] by computing sin(pi/2 - x), which is again a series evaluated near 0.0, and so on.

fjardon answered Sep 18 '22 23:09
