In follow-up to this question, it appears that some numbers cannot be represented exactly in floating point and are approximated instead. How are floating point numbers stored? Is there a common standard for the different sizes? What kind of gotchas do I need to watch out for if I use floating point? Are they cross-language compatible (i.e., what conversions do I need to deal with to send a floating point number from a Python program to a C program over TCP/IP)?


How is floating point stored? When does it matter?

2 Answers

As mentioned, the Wikipedia article on IEEE 754 does a good job of showing how floating point numbers are stored on most systems.
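
To make the layout concrete, here is a minimal sketch that splits a Python float into the three fields of an IEEE 754 double (1 sign bit, 11 exponent bits, 52 fraction bits). It assumes Python's `float` is a 64-bit IEEE 754 double, which holds on essentially every mainstream platform; `decompose_double` is a hypothetical helper name chosen for illustration.

```python
import struct


def decompose_double(x: float) -> tuple:
    """Return (sign, biased_exponent, fraction) for a 64-bit IEEE 754 double."""
    # Pack the float as 8 big-endian bytes, then reinterpret them
    # as one unsigned 64-bit integer so the bit fields can be masked out.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                   # 1 bit
    exponent = (bits >> 52) & 0x7FF     # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)   # 52 bits; normal numbers have an implicit leading 1
    return sign, exponent, fraction


sign, exponent, fraction = decompose_double(0.1)
# 0.1 is stored as roughly 1.6 * 2**-4, so the unbiased exponent is -4.
print(sign, exponent - 1023, hex(fraction))  # 0 -4 0x999999999999a
```

The repeating `9999...a` pattern in the fraction is the visible sign that 0.1 has no finite binary expansion and is stored as the nearest representable value.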

Now, here are some common gotchas:

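The biggest is that you almost never want to compare two floating point numbers for exact equality (or inequality). You'll want to use greater than/less than comparisons instead, typically by checking that the difference between the two values is smaller than some tolerance.
The more operations you do on a floating point number, the more significant rounding errors can become, because each operation can add a small error to the ones already accumulated.
Precision is limited by the size of the fraction, so you may not be able to correctly add numbers that are separated by many orders of magnitude. (For example, adding 1E-30 to 1E30 just gives 1E30 back, since a double holds only about 15 to 17 significant decimal digits.)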
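
The three gotchas above can be reproduced in a few lines. This is a small sketch using only the Python standard library; the variable names are illustrative, and the default tolerance of `math.isclose` is just one reasonable choice, not a universal one.

```python
import math

# Gotcha 1: exact equality fails for values that are equal on paper.
a = 0.1 + 0.2
naive_equal = (a == 0.3)               # False: a is 0.30000000000000004
tolerant_equal = math.isclose(a, 0.3)  # True: compares within a relative tolerance

# Gotcha 2: rounding errors accumulate over repeated operations.
total = 0.0
for _ in range(10):
    total += 0.1
drifted = (total != 1.0)               # True: total is 0.9999999999999999
# math.fsum tracks the lost low-order bits and returns the correctly rounded sum.
compensated = math.fsum([0.1] * 10)    # 1.0

# Gotcha 3: a tiny addend is absorbed by a much larger one.
absorbed = (1e30 + 1e-30 == 1e30)      # True: the sum is unchanged

print(naive_equal, tolerant_equal, drifted, compensated, absorbed)
```

When picking a tolerance, a relative one (as `math.isclose` uses by default) scales with the magnitude of the operands; comparisons against zero usually need an explicit `abs_tol` as well.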

answered Oct 07 '22 00:10

Rob Pilkington

The standard is IEEE 754.
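
Because both ends usually speak IEEE 754, sending a float between programs written in different languages mostly comes down to agreeing on the width and the byte order. Below is a minimal sketch of the Python side, assuming the two programs agree on an 8-byte double in network (big-endian) byte order; `encode_double` and `decode_double` are hypothetical helper names, and the socket handling itself is omitted.

```python
import struct


def encode_double(value: float) -> bytes:
    """Serialize a float as 8 bytes: IEEE 754 binary64, big-endian."""
    # "!" selects network byte order with standard sizes; "d" is a double.
    return struct.pack("!d", value)


def decode_double(payload: bytes) -> float:
    """Inverse of encode_double; expects exactly 8 bytes."""
    (value,) = struct.unpack("!d", payload)
    return value


wire = encode_double(3.14)
print(len(wire))                    # 8
print(decode_double(wire) == 3.14)  # True: the bit pattern round-trips exactly
```

On the C side, the receiver would read the same 8 bytes, convert them from big-endian to host byte order, and copy them into a `double`. That only works as described if the C platform's `double` is itself IEEE 754 binary64, which is an assumption worth checking on unusual hardware.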

Of course, there are other ways to store numbers when IEEE 754 isn't good enough. Libraries like Java's BigDecimal are available for most platforms and map well to SQL's number type. Irrational numbers can be represented symbolically, and ratios that can't be represented exactly in binary or decimal floating point can be stored as a numerator/denominator pair.
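
As a rough illustration in Python rather than Java, the standard library's `decimal.Decimal` plays a role similar to BigDecimal and `fractions.Fraction` stores exact ratios. This is only a sketch of the idea; symbolic handling of irrational numbers needs a third-party library and is not shown.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floating point: 0.1 and 0.2 are both approximations.
binary_exact = (0.1 + 0.2 == 0.3)                                   # False

# Decimal floating point: values built from strings are stored exactly as written.
decimal_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True

# Exact ratio: 1/3 has no finite binary or decimal expansion,
# but a numerator/denominator pair represents it without error.
third = Fraction(1, 3)
ratio_exact = (third * 3 == 1)                                       # True

print(binary_exact, decimal_exact, ratio_exact)
```

Note that `Decimal` should be constructed from strings or integers; `Decimal(0.1)` would faithfully copy the binary approximation of 0.1 rather than the decimal value.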

answered Oct 06 '22 22:10

erickson


Tags: floating-point, rounding-error

Adam Davis
