Numerical precision for difference of squares

In my code I often compute things like the following piece (C code here for simplicity):

float cos_theta = /* some simple operations; no cosf call! */;
float sin_theta = sqrtf(1.0f - cos_theta * cos_theta); // Option 1

For this example, ignore that the argument of the square root might be negative due to imprecision; I fixed that with an additional fdimf call. However, I wonder whether the following is more precise:

float sin_theta = sqrtf((1.0f + cos_theta) * (1.0f - cos_theta)); // Option 2

cos_theta is between -1 and +1, so for each choice there will be situations where I subtract similar numbers and thus lose precision, right? Which is the most precise, and why?
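For reference, the fdimf fix mentioned above might look like this (fdimf(x, y) returns x - y when x > y and +0 otherwise, so the radicand can never go negative):

float sin_theta = sqrtf(fdimf(1.0f, cos_theta * cos_theta)); // radicand clamped at +0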

asked Aug 13 '13 by cschwan



1 Answer

The most precise way with floats is likely to compute both sin and cos using a single x87 instruction, fsincos.
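If the angle itself is available, glibc exposes this as the non-standard sincosf function. A minimal sketch, assuming a GNU toolchain (note that the question's setup computes cos_theta directly, so this only applies when theta is at hand):

#define _GNU_SOURCE            // sincosf is a GNU extension
#include <math.h>
#include <stdio.h>

int main(void)
{
    float theta = 0.5f;        // hypothetical angle, for illustration only
    float s, c;
    sincosf(theta, &s, &c);    // computes both values in one call
    printf("sin = %g, cos = %g\n", s, c);
    return 0;
}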

However, if you need to do the computation manually, it's best to group arguments of similar magnitude. This means the second option is more precise, especially when cos_theta is close to ±1, where the subtraction in 1.0f - cos_theta * cos_theta cancels catastrophically and precision matters the most.
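A quick way to see this (my own sketch, not part of the original answer) is to compare both options against a double-precision reference as cos_theta approaches 1. On an IEEE 754 machine, option 1's relative error typically grows as cos_theta nears ±1, while option 2 stays within a few ulps:

#include <math.h>
#include <stdio.h>

int main(void)
{
    // Values of cos_theta approaching 1, where 1 - c*c cancels.
    const float cs[] = { 0.5f, 0.99f, 0.999994f, 0.99999994f };

    for (int i = 0; i < 4; ++i) {
        float c = cs[i];
        float opt1 = sqrtf(1.0f - c * c);                // Option 1
        float opt2 = sqrtf((1.0f + c) * (1.0f - c));     // Option 2
        double ref = sqrt(1.0 - (double)c * (double)c);  // reference value

        printf("c = %-12.8g opt1 rel. err = %-10.3g opt2 rel. err = %.3g\n",
               c, fabs(opt1 - ref) / ref, fabs(opt2 - ref) / ref);
    }
    return 0;
}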

As the article What Every Computer Scientist Should Know About Floating-Point Arithmetic notes:

The expression x² - y² is another formula that exhibits catastrophic cancellation. It is more accurate to evaluate it as (x - y)(x + y).

Edit: it's more complicated than this. Although the above is generally true, (x - y)(x + y) is slightly less accurate when x and y are of very different magnitudes, as the footnote to the statement explains:

In this case, (x - y)(x + y) has three rounding errors, but x² - y² has only two, since the rounding error committed when computing the smaller of x² and y² does not affect the final subtraction.

In other words: computing x - y, x + y, and the product (x - y)(x + y) each introduces a rounding error, for three steps of rounding error in total. Computing x², y², and the difference x² - y² also each introduces a rounding error, but when x and y differ greatly in magnitude, the error from squaring the smaller of the two is so negligible that there are effectively only two steps of rounding error, making the difference of squares more precise in that regime.
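The footnote's condition can be checked directly. Here is a small sketch (again mine, not from the paper) with x and y of very different magnitudes; the squared form typically shows the smaller error here, since it commits effectively two roundings against the factored form's three:

#include <math.h>
#include <stdio.h>

int main(void)
{
    // x and y of very different magnitudes: y*y rounds, but its error
    // is far below the last bit of x*x and vanishes in the subtraction.
    float x = 1.0f, y = 3e-4f;

    float squares  = x * x - y * y;          // effectively two roundings
    float factored = (x - y) * (x + y);      // three roundings
    double ref = (double)x * x - (double)y * y;  // exact in double

    printf("squares : rel. err = %.3g\n", fabs(squares - ref) / ref);
    printf("factored: rel. err = %.3g\n", fabs(factored - ref) / ref);
    return 0;
}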

So when x and y differ greatly in magnitude, as they do here whenever cos_theta is small, option 1 is actually going to be more precise. This is confirmed by dev.brutus's Java test.

answered Oct 01 '22 by 1''