Whence the 37 digits of extra precision in `decimal.Decimal(1.0/3.0)`?

This post is about the number of significant digits in the expression decimal.Decimal(1.0/3.0).

The documentation for decimal.Decimal says that "[t]he significance of a new Decimal is determined solely by the number of digits input".

From this it follows, I think, that the number of significant digits in decimal.Decimal(1.0/3.0) should be determined by the number of significant digits in the IEEE 754 double resulting from the operation 1.0/3.0.

Now, it is my understanding that an IEEE 754 64-bit double has a "15-17 significant decimal digits precision".

Therefore, taking all of the above together, I expect that decimal.Decimal(1.0/3.0) will have at most 17 significant decimal digits.

It appears, however, that decimal.Decimal(1.0/3.0) is given at least 54 significant decimal digits:

import decimal
print(decimal.Decimal(1.0/3.0))

# 0.333333333333333314829616256247390992939472198486328125
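For contrast, Python's own repr of the same float already limits itself to the shortest decimal string that converts back to the identical double, which is one way of reading the "at most 17 digits" expectation:

```python
f = 1.0 / 3.0
# repr() produces the shortest string that round-trips to exactly f
print(repr(f))
assert float(repr(f)) == f
# that string never needs more than 17 significant digits
assert len(repr(f).lstrip('0.')) <= 17
```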

All of this boils down to two key questions:

  1. what is the basis for the claim that an IEEE 754 double has "a 15-17 significant decimal digits precision"?
  2. how can the contradiction among the following items be resolved?
    1. the documentation for decimal.Decimal cited above
    2. the 54 (or more) significant digits in decimal.Decimal(1.0/3.0)
    3. the maximum of 17 for the significant decimal digits in an IEEE 754 double.

Addendum: OK, I now have a better understanding of the situation, thanks to ajcr's answer and several additional comments.

Internally, decimal represents 1.0/3.0 as the fraction

6004799503160661/18014398509481984

The denominator of this fraction is 2^54. The numerator is (2^54 - 1)/3, exactly.

The decimal representation of this fraction is, exactly

0.333333333333333314829616256247390992939472198486328125
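This can be verified with the standard library alone; nothing here is specific to decimal:

```python
from fractions import Fraction

f = 1.0 / 3.0
n, d = f.as_integer_ratio()   # the exact ratio encoded by the double
print(n, d)                   # 6004799503160661 18014398509481984
assert d == 2**54
assert 3 * n == 2**54 - 1     # i.e. the numerator is (2**54 - 1)/3, exactly
assert Fraction(f) == Fraction(n, d)
```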

Addendum 2: One more try. A floating point number F is a stand-in for an uncountable set of real numbers. This set includes the rational number Q(F) that is exactly represented by the floating point number F, along with uncountably many real values above and below Q(F). Now, given a real number R within the range of 64-bit IEEE 754 doubles, let F(R) be the double that R gets mapped to when it is represented as a floating point number. [1]

For example, if R = 1/3, then F(R) is the IEEE 754 double given by the following 64 bits:

0 01111111101 0101010101010101010101010101010101010101010101010101 = F(R)

...and Q(F(R)) is the fraction N/D, where D = 2^54 = 18014398509481984, and N = (2^54 - 1)/3 = 6004799503160661. In short:

6004799503160661/18014398509481984 = Q(F(R))

or, alternatively, as an exact decimal:

0.333333333333333314829616256247390992939472198486328125 = Q(F(R))
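The 64-bit pattern above can be checked with the struct module; the slicing into sign, exponent, and fraction fields follows the IEEE 754 binary64 layout:

```python
import struct

f = 1.0 / 3.0
# reinterpret the double's 8 bytes as a 64-bit unsigned integer
(bits,) = struct.unpack('>Q', struct.pack('>d', f))
s = format(bits, '064b')
sign, exponent, fraction = s[0], s[1:12], s[12:]
print(sign, exponent, fraction)
assert sign == '0'
assert exponent == '01111111101'   # biased 1021, i.e. 2**(1021-1023) = 2**-2
assert fraction == '0101' * 13     # the 52-bit fraction field
```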

But the floating point F(R) stands in not only for R = 1/3 and Q(F(R)) = N/D, but also for all the real numbers in the range (A, B) [2], where A = (2N - 1)/(2D) and B = (2N + 1)/(2D). Below I show the exact decimal representations of A < Q(F(R)) < B, together with a 54-digit inexact representation of R = 1/3:

0.3333333333333332593184650249895639717578887939453125   = A
0.333333333333333314829616256247390992939472198486328125 = Q(F(R))
0.333333333333333333333333333333333333333333333333333333 ~ R
0.33333333333333337034076748750521801412105560302734375  = B
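The endpoints A and B can be computed exactly with Fraction, and (on Python 3.9+) math.nextafter confirms that they are the midpoints between F(R) and its neighbouring doubles:

```python
import math
from fractions import Fraction

f = 1.0 / 3.0
n, d = f.as_integer_ratio()
q = Fraction(n, d)             # Q(F(R)), exactly
a = Fraction(2*n - 1, 2*d)     # halfway to the next-smaller double
b = Fraction(2*n + 1, 2*d)     # halfway to the next-larger double

# every real in (a, b) rounds to the same double f, and 1/3 is among them
assert a < q < Fraction(1, 3) < b
# a and b are the midpoints to the adjacent doubles (math.nextafter: 3.9+)
assert a == (q + Fraction(math.nextafter(f, 0.0))) / 2
assert b == (q + Fraction(math.nextafter(f, 2.0))) / 2
```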

Now, here are the decimal representations of the same four numbers A, Q(F(R)), R, and B, but now rounded to 17 significant figures:

0.33333333333333326 ~ A
0.33333333333333331 ~ Q(F(R))
0.33333333333333333 ~ R
0.33333333333333337 ~ B

This should at least explain why IEEE 754 doubles are said to have "15-17 significant decimal digits precision". More explicitly, the decimal representations of any two real numbers represented by a given IEEE 754 double will agree at between 15 and 17 of their most significant figures.
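The 15-17 figure can also be seen operationally: 17 significant digits are always enough to recover a double exactly, while at 15 digits the four numbers above collapse into one:

```python
f = 1.0 / 3.0
s17 = format(f, '.17g')
print(s17)                     # 0.33333333333333331
# 17 significant digits always round-trip a binary64 double exactly
assert float(s17) == f
# at 15 significant digits A, Q(F(R)), R, and B all print the same
assert format(f, '.15g') == '0.333333333333333'
```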

OK, back to Q(F(R)). Yes, this is a rational number whose denominator is a power of 2, and therefore we can compute its decimal expansion exactly; it terminates after 54 significant figures. But this number's role here is strictly as the canonical representative of an uncountable set of real numbers, and all these numbers share at most 17 significant figures. Therefore, to use any more significant figures in the expansion of Q(F(R)) amounts to a misrepresentation of this set of real numbers. IOW, the least significant 37 figures in the decimal expansion of Q(F(R)) are, in this sense, extraneous, irrelevant, indeed not significant, with respect to the role of Q(F(R)) as a stand-in for all the numbers in (A, B), including R.

To put it differently, when acting in its role as representative of the interval (A, B), Q(F(R)) should be just

0.33333333333333331 ~ Q(F(R))

The rest of its decimal expansion is not germane to this role, and therefore, it should be kept out of view.

I realize that it may be too difficult to design decimal any better than it is, given all the demands placed on it. IOW, the misrepresentation described above may be, practically speaking, unavoidable. At the very least, however, it should be clearly documented, along with all the other more-or-less unavoidable misrepresentations connected with floating point numbers.


[1] Yes, I am maintaining the distinction between the IEEE 754 double F(R) (a particular sequence of bits in memory) and the rational number Q(F(R)) (a mathematical entity), just to be absolutely clear.

[2] I suppose it also includes one of the endpoints of this range, but this detail is not important here.

asked Dec 18 '22 by kjo

1 Answer

When passed a float, Decimal uses the from_float constructor. This class method constructs a Decimal from a single Python float exactly; it doesn't know how the float was calculated, or that humans might consider it accurate only to a certain number of digits.

Instead, it determines the appropriate number of digits to take from the float by considering it as a ratio of two integers. This is at line 740:

n, d = abs(f).as_integer_ratio()
k = d.bit_length() - 1
result = _dec_from_triple(sign, str(n*5**k), -k)

This means for 1.0/3.0 we have the following:

>>> f = 1.0 / 3.0
>>> f.as_integer_ratio()
(6004799503160661, 18014398509481984)
>>> (18014398509481984).bit_length()
55

To construct the decimal, the sign, coefficient and exponent are calculated and passed to _dec_from_triple. In this case the coefficient is the string:

'333333333333333314829616256247390992939472198486328125'

and the exponent is -(55-1) = -54. This gives the Decimal exactly 54 digits after the decimal point, hence your observation.
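Putting these pieces together, the whole computation can be replayed in a few lines (the variable names are illustrative, not the stdlib's):

```python
f = 1.0 / 3.0
n, d = abs(f).as_integer_ratio()
k = d.bit_length() - 1        # d == 2**k, so f == n / 2**k
assert d == 2**k
# n / 2**k == (n * 5**k) / 10**k, so the coefficient digits are n * 5**k
coefficient = str(n * 5**k)
exponent = -k
print(coefficient, exponent)
assert exponent == -54
assert len(coefficient) == 54
```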

answered Dec 28 '22 by Alex Riley