What are some good dos and don'ts for floating-point arithmetic (IEEE 754, in case there's confusion) to ensure good numerical stability and high accuracy in your results?
I know a few like don't subtract quantities of similar magnitude, but I'm curious what other good rules are out there.
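To make the rule about subtracting nearly equal quantities concrete, here is a minimal sketch in Python (whose `float` is an IEEE 754 double on common platforms). The function names are made up for illustration. It computes 1 - cos(x) for a tiny x two ways: directly, where almost all significant digits cancel, and via the identity 1 - cos(x) = 2 sin²(x/2), which avoids the subtraction.

```python
import math

def one_minus_cos_naive(x):
    # cos(x) is extremely close to 1 for tiny x, so the subtraction
    # cancels nearly every significant digit.
    return 1.0 - math.cos(x)

def one_minus_cos_stable(x):
    # Same quantity rewritten as 2*sin(x/2)**2: no subtraction of
    # nearly equal values, so the relative error stays small.
    s = math.sin(x / 2.0)
    return 2.0 * s * s

x = 1e-8
naive = one_minus_cos_naive(x)
stable = one_minus_cos_stable(x)

# The true value is about x**2 / 2 = 5e-17.
print("naive: ", naive)
print("stable:", stable)
```

The general pattern is to look for an algebraically equivalent form that does not subtract two close values, rather than to add precision after the fact.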
Arithmetic operations on floating point numbers consist of addition, subtraction, multiplication and division. The operations are done with algorithms similar to those used on sign-magnitude integers (because of the similarity of representation). For example, the magnitudes are added directly only when the two numbers have the same sign; when the signs differ, the smaller magnitude is subtracted from the larger.
Most decimal fractions do not have an exact binary floating-point representation. This is a side effect of how the CPU represents floating-point data. For this reason, you may see some loss of precision, and some floating-point operations may produce unexpected results.
The number 0.1 in floating point: the binary expansion of 1/10 is $0.0\overline{0011}_2$ (that is, 0.000110011001100..., with the block 0011 repeating forever). It can't be represented exactly in floating point because there is no way to store the repeating bar. Whatever the data type, we can keep only a fixed number of digits/bits, so the stored value is a rounded approximation.
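A quick way to see this rounding, sketched in Python with the standard `decimal` module (the variable names are just for illustration): converting the double nearest to 0.1 into a `Decimal` shows the exact value that is actually stored.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, with no rounding.
stored = Decimal(0.1)
print(stored)
# 0.1000000000000000055511151231257827021181583404541015625

# The hex form shows the repeating 0011 pattern (hex digit 9 = 1001),
# cut off and rounded up in the last digit.
hex_form = (0.1).hex()
print(hex_form)  # 0x1.999999999999ap-4

# Because each operand is already rounded, "obvious" identities can fail.
sum_matches = (0.1 + 0.2 == 0.3)
print(sum_matches)  # False
```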
In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base.
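As an illustration of that definition, the sketch below splits a Python `float` into an integer significand and a base-2 exponent. The helper `decompose` is a hypothetical name, and the constant 53 assumes the usual IEEE 754 double format; the code handles finite values only.

```python
import math

def decompose(x):
    """Return (significand, exponent) such that x == significand * 2**exponent,
    with the significand an integer of at most 53 bits (finite x only)."""
    m, e = math.frexp(x)          # x == m * 2**e with 0.5 <= |m| < 1
    significand = int(m * 2**53)  # exact: scaling by a power of two loses nothing
    return significand, e - 53

sig, exp = decompose(6.5)
print(sig, exp)

# Rebuilding the value from the two integers gives back the original float.
rebuilt = math.ldexp(sig, exp)
print(rebuilt == 6.5)
```

The same decomposition applied to 0.1 gives an integer significand too, which is another way to see that the stored number is a nearby binary fraction and not 1/10 itself.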
First, enter with the notion that floating point numbers do NOT necessarily follow the same rules as real numbers... once you have accepted this, you will understand most of the pitfalls.
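Here are some rules/tips that I've always followed:
Don't test a computed floating-point value for exact equality, as in if (myFloat == 0); check whether it lies within a small tolerance instead.
Floating-point addition is not associative: in general, (a + b) + c != a + (b + c), so the order of operations can change the result.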
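Both tips can be checked in a few lines of Python. This is only a sketch: `is_close_to_zero` is a made-up helper, and the tolerance `1e-12` is an arbitrary illustrative value that should really be chosen from the scale of your data. `math.isclose` and `math.fsum` are standard-library functions.

```python
import math

def is_close_to_zero(x, abs_tol=1e-12):
    """Tolerance test; abs_tol is illustrative, not a universal constant."""
    return abs(x) <= abs_tol

# Tip 1: a result that is mathematically zero is often a tiny nonzero float.
residual = 0.1 + 0.2 - 0.3
exact_test = (residual == 0)                 # False
tolerant_test = is_close_to_zero(residual)   # True

# For nonzero targets, a relative tolerance is usually more appropriate.
relative_test = math.isclose(0.1 + 0.2, 0.3)  # True

# Tip 2: grouping changes the result when magnitudes differ a lot.
a, b, c = 1e16, -1e16, 1.0
left = (a + b) + c    # 1.0: the big terms cancel first, then 1.0 is added
right = a + (b + c)   # 0.0: the 1.0 is absorbed when added to -1e16

# math.fsum tracks the lost low-order bits and returns the correctly
# rounded sum regardless of order.
compensated = math.fsum([a, c, b])  # 1.0

print(exact_test, tolerant_test, relative_test)
print(left, right, compensated)
```

When summing many values of mixed magnitude, adding the small ones first or using a compensated sum such as `math.fsum` tends to lose less precision than a naive left-to-right loop.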