In my code I often compute things like the following piece (here C code for simplicity):
float cos_theta = /* some simple operations; no cosf call! */;
float sin_theta = sqrtf(1.0f - cos_theta * cos_theta); // Option 1
For this example, ignore that the argument of the square root might become slightly negative due to imprecision; I fixed that with an additional fdimf call. However, I wondered whether the following is more precise:
float sin_theta = sqrtf((1.0f + cos_theta) * (1.0f - cos_theta)); // Option 2
cos_theta is between -1 and +1, so for each choice there will be situations where I subtract similar numbers and thus lose precision, right? Which is the most precise, and why?
The difference of two squares identity is (a + b)(a - b) = a^2 - b^2.
Addition and subtraction may require shifting one mantissa so the exponents match, which can discard some (or all) of the digits of the smaller operand. Multiplication: the product of two n-digit numbers is a 2n-digit number, so digits are lost again, and the product might not even be representable.
We use the quadratic formula to solve for h. The quadratic formula itself can be a cause of loss of significance if the quantity 4ac is very small. This can be remedied by choosing the sign that avoids subtracting: x1 = (-b - sqrt(b^2 - 4ac)) / (2a), or, in this case, h1 = (-2x - sqrt(4x^2 + 4x^2)) / 2 = -(1 + sqrt(2))x.
The most precise way with floats is likely to compute both sin and cos using a single x87 instruction, fsincos.
However, if you need to do the computation manually, it is best to group terms of similar magnitude. This means the second option is more precise, especially when cos_theta
is close to 0, where precision matters the most.
As the article What Every Computer Scientist Should Know About Floating-Point Arithmetic notes:
The expression x^2 - y^2 is another formula that exhibits catastrophic cancellation. It is more accurate to evaluate it as (x - y)(x + y).
Edit: it's more complicated than this. Although the above is generally true, (x - y)(x + y) is slightly less accurate when x and y are of very different magnitudes, as the footnote to the statement explains:
In this case, (x - y)(x + y) has three rounding errors, but x^2 - y^2 has only two since the rounding error committed when computing the smaller of x^2 and y^2 does not affect the final subtraction.
In other words, taking x - y, x + y, and the product (x - y)(x + y) each introduce rounding errors (3 steps of rounding error). x^2, y^2, and the subtraction x^2 - y^2 also each introduce rounding errors, but the rounding error from squaring a relatively small number (the smaller of x and y) is so negligible that there are effectively only two steps of rounding error, making the difference of squares more precise.
So option 1 is actually going to be more precise. This is confirmed by dev.brutus's Java test.