Floating-Point Arithmetic: What is the worst precision loss from decimal to binary?

As is well known, decimal fractions (like 0.1), when stored as floating point (like double or float), are internally represented in binary format (IEEE 754), and some decimal fractions cannot be represented exactly in that format.

What I have not understood is the precision of this conversion:

1.) A floating-point number itself has a precision (determined by its significand)?

2.) But the conversion from a decimal fraction to a binary fraction also incurs a precision loss?

Question:

What is the worst-case precision loss (over all possible decimal fractions) when converting a decimal fraction to a floating-point value?

(The reason I want to know this: when comparing decimal fractions with binary/floating-point fractions, I need to take this precision into account to determine whether the two figures are identical, and I want the tolerance to be as tight as possible: decimal fraction == binary fraction +/- precision.)
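Side note on how tight that tolerance can be: if the decimal-to-double conversion rounds to nearest (as IEEE 754 specifies and as CPython's float() does), the stored double differs from the true decimal value by at most half a ULP of the stored value. A minimal Python sketch of a comparison using exactly that tolerance (requires Python 3.9+ for math.ulp; the helper name matches_decimal is made up for illustration):

    import math
    from fractions import Fraction

    def matches_decimal(text: str, x: float) -> bool:
        """Compare the decimal fraction in `text` against the double x,
        using the tightest sound tolerance: half a ULP of x, the
        worst-case error of a correctly rounded conversion."""
        diff = abs(Fraction(text) - Fraction(x))  # exact rational arithmetic
        return diff <= Fraction(math.ulp(x)) / 2

    print(matches_decimal("0.1", 0.1))                  # True
    print(matches_decimal("0.1", 0.1 + math.ulp(0.1)))  # False: one ULP off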

Example (only hypothetical)

0.1 dec => 0.10000001212121212121212 (binary fraction, double) => precision loss 0.00000001212121212121212
0.3 dec => 0.300000282828282 (binary fraction, double) => precision loss 0.000000282828282
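The real values can be printed exactly: in Python, Decimal and Fraction show a double's stored value with no further rounding. A short sketch:

    from decimal import Decimal
    from fractions import Fraction

    for text in ("0.1", "0.3"):
        stored = float(text)                       # nearest IEEE-754 double
        error = Fraction(stored) - Fraction(text)  # exact conversion error
        print(f"{text} dec => {Decimal(stored)} (double), error {error}")

On an IEEE-754 machine this shows 0.1 stored slightly high and 0.3 slightly low, each within half a ULP of the stored value.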
asked Aug 24 '11 by Markus


People also ask

What is the precision value of floating point?

According to this standard, floating point numbers are represented with 32 bits (single precision) or 64 bits (double precision).

Why is 0.1 + 0.2 === 0.3 false, and how can you ensure precise decimal arithmetic?

For example, 0.1 and 0.2 cannot be represented precisely. Hence, 0.1 + 0.2 === 0.3 yields false. To really understand why 0.1 cannot be represented properly as a 32-bit floating-point number, you must understand binary. Representing many decimals in binary requires an infinite number of digits.
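The === above is JavaScript, but the result is the same in any language using IEEE-754 doubles; a quick Python check of the same arithmetic:

    print(0.1 + 0.2 == 0.3)        # False
    print(0.1 + 0.2)               # 0.30000000000000004
    print(abs((0.1 + 0.2) - 0.3))  # about 5.5e-17 of accumulated error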

What is the highest and lowest precision for IEEE single precision floating points?

IEEE-754 single precision: instead of storing the exponent m, we store c = m + 127. Thus, the largest possible exponent is 127, and the smallest possible exponent is -126.
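A small sketch of that layout, assuming the standard IEEE-754 single-precision format, which unpacks the stored (biased) exponent c and recovers m = c - 127:

    import struct

    def float32_fields(x: float):
        """Split a number's IEEE-754 single-precision bits into
        sign, unbiased exponent m, and mantissa bits (normal numbers only)."""
        bits, = struct.unpack(">I", struct.pack(">f", x))
        sign = bits >> 31
        c = (bits >> 23) & 0xFF     # stored exponent c = m + 127
        mantissa = bits & 0x7FFFFF
        return sign, c - 127, mantissa

    print(float32_fields(1.0))   # (0, 0, 0): m = 0 is stored as c = 127
    print(float32_fields(0.5))   # (0, -1, 0): exponent -1 stored as c = 126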

Why do floating point numbers have limited precision?

Floating-point decimal values generally do not have an exact binary representation. This is a side effect of how the CPU represents floating point data. For this reason, you may experience some loss of precision, and some floating-point operations may produce unexpected results.


2 Answers

It is not entirely clear to me what you are after, but you may be interested in the following paper, which discusses many of the accuracy issues involved in binary/decimal conversion, including lists of hard cases.

Vern Paxson and William Kahan, "A Program for Testing IEEE Decimal-Binary Conversion," May 22, 1991. http://www.icir.org/vern/papers/testbase-report.pdf

answered Oct 05 '22 by njuffa


Floating point becomes less and less accurate in absolute terms the larger the value gets (in both the positive and negative directions), because floating point is an exponential format: the gap between adjacent representable values grows with the exponent.

A decimal representation, by contrast, becomes more exact the more decimal places it carries, regardless of how large the number is.

Therefore, the worst absolute precision difference occurs toward the numerical limits of whatever floating-point type you're using.
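A sketch of that effect, using Python's math.ulp (3.9+) to print the spacing between adjacent doubles, and hence the worst-case absolute conversion error, at growing magnitudes:

    import math

    # One ULP (the gap between adjacent doubles) grows with magnitude,
    # so the worst-case absolute conversion error (half a ULP) grows too.
    for x in (1.0, 1e8, 1e16, 1e300):
        print(f"x = {x:.0e}: ulp = {math.ulp(x):.3e}, worst case = {math.ulp(x) / 2:.3e}")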

answered Oct 06 '22 by Arafangion