
Is catastrophic cancellation an issue when calculating dot products of floating point vectors? And if so, how is it typically addressed?

I am writing a physics simulator in C++ and am concerned about robustness. I've read that catastrophic cancellation can occur in floating point arithmetic when the difference of two numbers of almost equal magnitude is calculated. It occurred to me that this may happen in the simulator when the dot product of two almost orthogonal vectors is calculated. However, the references I have looked at only discuss solving the problem by rewriting the equation concerned (eg the quadratic formula can be rewritten to eliminate the problem) - but this doesn't seem to apply when calculating a dot product? I guess I'd be interested to know if this is typically an issue in physics engines and how it is addressed.

asked Jan 15 '12 by Nick Kovac



2 Answers

One common trick is to make the accumulator variable a type with higher precision than the vectors themselves.
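A minimal sketch of that trick, assuming `float` vectors (the function name and use of `std::vector` are illustrative, not prescribed by the answer):

```cpp
#include <cstddef>
#include <vector>

// Dot product of float vectors, accumulated in double.
// Each product is promoted to double before it is added, so the
// accumulator carries far more precision than the input data.
double dot_double_acc(const std::vector<float>& a, const std::vector<float>& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += static_cast<double>(a[i]) * static_cast<double>(b[i]);
    }
    return sum;
}
```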

Alternatively, one can use Kahan summation when summing the terms.
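Kahan (compensated) summation applied to a dot product might look like this sketch; the compensation variable `c` carries the low-order bits that each addition would otherwise discard:

```cpp
#include <cstddef>
#include <vector>

// Kahan-compensated dot product in pure float arithmetic.
float dot_kahan(const std::vector<float>& a, const std::vector<float>& b) {
    float sum = 0.0f;
    float c = 0.0f;  // running compensation for lost low-order bits
    for (std::size_t i = 0; i < a.size(); ++i) {
        float term = a[i] * b[i];
        float y = term - c;        // apply the correction from the last step
        float t = sum + y;         // low-order bits of y may be lost here
        c = (t - sum) - y;         // recover what was lost; algebraically zero
        sum = t;
    }
    return sum;
}
```

Note that aggressive optimization flags such as `-ffast-math` license the compiler to simplify `(t - sum) - y` to zero, which defeats the compensation; Kahan summation must be compiled with strict floating-point semantics.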

Another approach is to use various blocked dot product algorithms instead of the canonical algorithm.
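A simple blocked variant, as a sketch: fixed-size blocks are summed separately and the per-block partial sums are then combined, which keeps the magnitudes of intermediate sums closer together than one long running sum (the block size of 256 is an arbitrary illustrative choice; a full pairwise scheme would sum the partials recursively as well):

```cpp
#include <cstddef>
#include <vector>

// Blocked dot product: accumulate each block into its own partial
// sum, then add the partials together.
float dot_blocked(const std::vector<float>& a, const std::vector<float>& b) {
    const std::size_t block = 256;
    const std::size_t n = a.size();
    float total = 0.0f;
    for (std::size_t start = 0; start < n; start += block) {
        const std::size_t end = (start + block < n) ? start + block : n;
        float partial = 0.0f;
        for (std::size_t i = start; i < end; ++i) {
            partial += a[i] * b[i];
        }
        total += partial;
    }
    return total;
}
```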

One can of course combine the compensated and blocked approaches.

Note that the above concerns the general error behavior of dot products, not catastrophic cancellation specifically.

answered Sep 23 '22 by janneb

You say in a comment that you have to calculate x1*x2 + y1*y2, where all variables are floats. So if you do the calculation in double-precision, you lose no accuracy at all, because double-precision has more than twice as many bits of precision as float (assuming your target uses IEEE-754).

Specifically: let xx, yy be the real numbers represented by the float variables x, y. Let xxyy be their product, and let xy be the result of the double-precision multiplication x * y. Then in all cases, xxyy is the real number represented by xy.
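A small sketch illustrating the point (the input values are chosen here for illustration so that the two terms nearly cancel; a `float` significand has 24 bits, so each product needs at most 48 bits and fits exactly in double's 53-bit significand):

```cpp
// All-float evaluation of x1*x2 + y1*y2: the first product must be
// rounded to 24 bits, so a nearly-cancelling sum can lose the result.
float dot2_float(float x1, float x2, float y1, float y2) {
    return x1 * x2 + y1 * y2;
}

// Promote to double first: both products are computed exactly, so
// the cancellation in the final sum is harmless.
double dot2_double(float x1, float x2, float y1, float y2) {
    return static_cast<double>(x1) * x2 + static_cast<double>(y1) * y2;
}
```

For example, with x1 = x2 = 4097, the exact product 16785409 is not representable in `float` and rounds to 16785408; subtracting 16785408 then yields 0 in float, while the double-precision version returns the exact answer 1.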

answered Sep 19 '22 by TonyK