
How is floating point stored? When does it matter?

As a follow-up to this question: it appears that some numbers cannot be represented in floating point at all, and are approximated instead.

How are floating point numbers stored?

Is there a common standard for the different sizes?

What kind of gotchas do I need to watch out for if I use floating point?

Are they cross-language compatible (i.e., what conversions do I need to deal with to send a floating point number from a Python program to a C program over TCP/IP)?

Adam Davis asked Sep 11 '08 15:09


People also ask

How is floating point stored?

Scalars of type float are typically stored using four bytes (32 bits). The format used follows the IEEE 754 standard. The mantissa holds the actual binary digits of the floating-point number.

Why is floating point so important?

Floating point representation makes numerical computation much easier. You could write all your programs using integers or fixed-point representations, but this is tedious and error-prone.

Why can't floating point numbers be stored accurately?

Because they often approximate rationals that cannot be represented finitely in base 2 (the digits repeat), and in general they approximate real (possibly irrational) numbers, which may not be representable in finitely many digits in any base.
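To make the approximation concrete, here is a minimal Python sketch (Python is my choice for illustration, and the variable names are mine). It relies on the fact that the standard library's `Decimal` and `Fraction` constructors convert a float's stored binary value exactly, so they show what the float written as 0.1 really holds.

```python
from decimal import Decimal
from fractions import Fraction

# Converting a float to Decimal or Fraction is exact, so these expose
# the value that is actually stored for the literal 0.1.
stored = Decimal(0.1)
print(stored)                    # slightly more than 0.1
print(stored > Decimal("0.1"))   # True: the stored value is not exactly 1/10

as_ratio = Fraction(0.1)
print(as_ratio)                  # numerator over a power-of-two denominator
print(as_ratio == Fraction(1, 10))  # False

# 0.5 is a power of two, so it is stored exactly.
print(Fraction(0.5) == Fraction(1, 2))  # True
```

The denominator of `as_ratio` is a power of two, which is why 1/10 (whose denominator has a factor of 5) can only be approximated.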

Where are floating point numbers stored?

Floating-point numbers are encoded by storing the significand and the exponent (along with a sign bit). For example, a significand of about 0.7854 multiplied by the base 2 raised to the power of 2 gives 3.14159. Like signed integer types, the high-order bit indicates sign; 0 indicates a positive value, 1 indicates negative.
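The sign/exponent/significand layout can be inspected directly. The sketch below assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits biased by 127, 23 fraction bits); the helper name `float32_fields` is hypothetical. Note that IEEE 754 normalizes the significand to the range [1, 2), so for 3.14159 the unbiased exponent comes out as 1 rather than 2.

```python
import struct

def float32_fields(x):
    """Split x, rounded to IEEE 754 single precision, into its three bit fields."""
    # Pack as a big-endian 32-bit float, then reinterpret the same bytes
    # as an unsigned 32-bit integer so the bits can be masked out.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit: 0 positive, 1 negative
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF       # 23 bits after the implicit leading 1
    return sign, exponent, fraction

sign, exponent, fraction = float32_fields(3.14159)

# Rebuild the value; this formula only covers normal numbers
# (not zero, subnormals, infinities, or NaN).
value = (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)
print(sign, exponent - 127, value)
```

The rebuilt `value` differs from 3.14159 only by the rounding that 23 fraction bits impose.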


2 Answers

As mentioned, the Wikipedia article on IEEE 754 does a good job of showing how floating point numbers are stored on most systems.

Now, here are some common gotchas:

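  • The biggest is that you almost never want to compare two floating point numbers for exact equality (or inequality). You'll want to use greater than/less than comparisons instead, typically by checking whether the difference between the two values is smaller than some tolerance.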
  • The more operations you do on a floating point number, the more significant rounding errors can become.
  • Precision is limited by the size of the fraction, so you may not be able to correctly add numbers that are separated by many orders of magnitude. (For example, adding 1E-30 to 1E30 just gives 1E30 again.)
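The three gotchas above can be reproduced in a few lines. This is a small Python sketch assuming ordinary 64-bit doubles (which is what Python's `float` is on common platforms); `math.isclose` and `math.fsum` are standard-library functions, and the variable names are mine.

```python
import math

# 1. Equality: 0.1 + 0.2 is not exactly 0.3, so compare within a tolerance.
total = 0.1 + 0.2
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True (default relative tolerance 1e-09)

# 2. Accumulated rounding: each addition rounds, and the errors add up.
naive = sum([0.1] * 10)
print(naive == 1.0)              # False
# math.fsum tracks the lost low-order bits and rounds only once at the end.
careful = math.fsum([0.1] * 10)
print(careful == 1.0)            # True

# 3. Absorption: the small term is far below the precision of the large one.
absorbed = 1e30 + 1e-30
print(absorbed == 1e30)          # True
```

Which tolerance is appropriate depends on the magnitudes involved; the default used by `math.isclose` is only one reasonable starting point.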
Rob Pilkington answered Oct 07 '22 00:10


The standard is IEEE 754.

Of course, there are other means of storing numbers when IEEE 754 isn't good enough. Libraries like Java's BigDecimal are available for most platforms and map well to SQL's number type. Symbols can be used for irrational numbers, and values that can't be represented exactly in binary or decimal floating point, such as 1/3, can be stored as a ratio.
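As an illustration of those alternatives, here is a minimal sketch using Python's standard `decimal` and `fractions` modules. Treating `Decimal` as a rough counterpart to Java's BigDecimal is my assumption for the example, not something the answer states, and the variable names are hypothetical.

```python
from decimal import Decimal
from fractions import Fraction

# Decimal arithmetic: construct from strings so the inputs are exact
# decimal values rather than already-rounded binary floats.
price = Decimal("0.10") + Decimal("0.20")
print(price)                      # 0.30
print(price == Decimal("0.30"))   # True, unlike 0.1 + 0.2 == 0.3 with floats

# Ratio storage: 1/3 has no finite binary or decimal expansion,
# but it is exact as a numerator/denominator pair.
third = Fraction(1, 3)
print(third * 3 == 1)             # True
print(third + Fraction(1, 6))     # 1/2
```

Both types trade speed for exactness, so they tend to suit money and symbolic-style calculations better than heavy numerical work.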

erickson answered Oct 06 '22 22:10