I'm having some trouble understanding why some numbers can't be represented exactly as floating-point numbers.
As we know, a normal float has a sign bit, an exponent, and a mantissa. Why can't 0.1, for example, be represented exactly in this system? The way I think of it, you would put 10 (1010 in binary) in the mantissa and -2 in the exponent. As far as I know, both numbers can be represented exactly in the mantissa and exponent. So why can't we represent 0.1 exactly?
Unfortunately, most decimal fractions cannot be represented exactly as binary fractions. A consequence is that, in general, the decimal floating-point numbers you enter are only approximated by the binary floating-point numbers actually stored in the machine.
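To see the approximation directly, here is a minimal Python sketch. It assumes Python's float is an IEEE 754 64-bit double, which is the case on common platforms. Converting a float to Decimal is exact, so the printed digits are the value actually stored for the literal 0.1.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, without rounding,
# so this prints every digit of the double closest to 0.1.
stored = Decimal(0.1)
print(stored)
# 0.1000000000000000055511151231257827021181583404541015625

# The stored value differs from the true decimal 0.1.
print(stored == Decimal("0.1"))  # False
```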
Floating-point decimal values generally do not have an exact binary representation. This is a side effect of how the CPU represents floating point data. For this reason, you may experience some loss of precision, and some floating-point operations may produce unexpected results.
Floating-point error mitigation is the minimization of errors caused by the fact that real numbers cannot, in general, be accurately represented in a fixed space. By definition, floating-point error cannot be eliminated, and, at best, can only be managed. Huberto M.
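As a rough illustration of managing rather than eliminating the error, the Python sketch below accumulates 0.1 ten times with a plain loop, then shows two standard-library tools: math.fsum, which tracks the lost low-order bits while summing, and math.isclose, which compares with a tolerance. The tolerance value is an arbitrary choice for this example.

```python
import math

values = [0.1] * 10

# Naive accumulation: each addition rounds, and the errors add up.
total = 0.0
for v in values:
    total += v
print(total == 1.0)  # False

# math.fsum keeps track of the rounding error of each partial sum.
print(math.fsum(values) == 1.0)  # True

# When exact equality is too strict, compare with a relative tolerance.
print(math.isclose(total, 1.0, rel_tol=1e-9))  # True
```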
Because JavaScript uses the IEEE 754 standard for its numbers, it stores them as 64-bit binary floating-point values. This causes precision errors in calculations with decimal fractions; in short, the machine works in base 2 while the values we write are in base 10.
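The same effect shows up in any language that uses IEEE 754 doubles. The sketch below uses Python rather than JavaScript, on the assumption that Python's float is the same 64-bit format.

```python
# Neither 0.1 nor 0.2 is stored exactly, and their rounded sum
# lands on a different double than the one closest to 0.3.
a = 0.1 + 0.2
print(a == 0.3)  # False
print(repr(a))   # 0.30000000000000004
```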
If your exponent is decimal (i.e. it represents 10^X), you can precisely represent 0.1 -- however, most floating point formats use binary exponents (i.e. they represent 2^X). Since there are no integers X and Y such that Y * (2 ^ X) = 0.1, you cannot precisely represent 0.1 in most floating point formats.
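One way to inspect the Y and X that a binary format actually picks for 0.1 is Python's float.as_integer_ratio, which returns the stored value as an exact fraction. This is a sketch assuming 64-bit doubles; the names num and den are just for illustration.

```python
from fractions import Fraction

# The stored value is num / den, where den is a power of two.
num, den = (0.1).as_integer_ratio()
print(num, den)        # 3602879701896397 36028797018963968
print(den == 2 ** 55)  # True, so Y = num and X = -55

# The nearest available Y * 2^X is close to 1/10 but not equal to it.
print(Fraction(num, den) == Fraction(1, 10))  # False
```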
Some languages have types with both kinds of exponent. In C#, for example, there is a data type aptly named decimal, which is a floating point format with a decimal exponent, so it supports storing a number like 0.1 exactly. It also has other uncommon properties: the decimal type can distinguish between 0.1 and 0.10, and it is always true that x + 1 != x for all values of x.
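Python's decimal.Decimal is not the C# decimal type (its precision and range differ), but it is also a decimal-exponent format and can serve as a rough analogue for these properties. The sketch below assumes the default decimal context and 64-bit binary floats.

```python
from decimal import Decimal

a = Decimal("0.1")
b = Decimal("0.10")

# Equal in value, yet the trailing zero is preserved in the representation.
print(a == b)          # True
print(str(a), str(b))  # 0.1 0.10

# A binary double runs out of integer precision above 2**53,
# so adding 1 can leave the value unchanged.
big = 1e16
print(big + 1 == big)  # True

# The decimal value at the same magnitude still changes.
big_dec = Decimal(10) ** 16
print(big_dec + 1 != big_dec)  # True
```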
For most common purposes, though, C# also has the float and double floating point types that cannot precisely store 0.1 because they use a binary exponent (as defined in IEEE-754). The binary floating point types use less storage, are faster because they are easier to implement, and have more operations defined on them. In general decimal is only used for financial values where the exact representation of all decimal values is important and the storage, speed, and range of operations are not.
You should start by reading "What Every Computer Scientist Should Know About Floating-Point Arithmetic".
Each floating-point number in the IEEE 754 standard is, in effect, some integer multiplied by some integer power of two. E.g., 3 is represented by 3 * 2^0, 96 is represented by 3 * 2^5, and 3/16 is represented by 3 * 2^-4.
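The decomposition can be checked with a small Python sketch. The helper name as_int_times_pow2 is made up for this example; it relies on float.as_integer_ratio, which returns the stored value as an exact fraction whose denominator is a power of two.

```python
def as_int_times_pow2(value: float) -> tuple[int, int]:
    """Return (m, e) with value == m * 2**e and m odd (or m == 0)."""
    num, den = value.as_integer_ratio()
    # den is a power of two, so its bit length gives the negative exponent.
    exp = -(den.bit_length() - 1)
    # Move any remaining factors of two from the integer into the exponent.
    while num != 0 and num % 2 == 0:
        num //= 2
        exp += 1
    return num, exp

print(as_int_times_pow2(3.0))     # (3, 0)
print(as_int_times_pow2(96.0))    # (3, 5)
print(as_int_times_pow2(3 / 16))  # (3, -4)
```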
There are no integers x and y such that 0.1 = x * 2^y, therefore 0.1 cannot be exactly represented by a floating-point number. Proof: If 0.1 = x * 2^y, then 10x = 2^-y. 2^-y is clearly positive, so x is positive. x is also an integer, so 10x is divisible by 10, so it is divisible by 5. Therefore 2^-y is a power of two that is divisible by 5, which is clearly impossible.
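The same argument gives a simple test: a rational number can be written as an integer times a power of two only if its denominator, in lowest terms, is itself a power of two. The Python sketch below checks just that condition; the function name is hypothetical, and real formats additionally limit the size of the integer and the exponent, which this check ignores.

```python
from fractions import Fraction

def is_int_times_pow2(q: Fraction) -> bool:
    """True if q == x * 2**y for some integers x and y."""
    d = q.denominator  # Fraction always stores lowest terms, d >= 1
    # A positive integer is a power of two when it has a single set bit.
    return (d & (d - 1)) == 0

print(is_int_times_pow2(Fraction(1, 10)))  # False: 10 = 2 * 5
print(is_int_times_pow2(Fraction(3, 16)))  # True:  16 = 2**4
print(is_int_times_pow2(Fraction(96)))     # True:  denominator is 1
```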