Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Maximum number of decimal digits that can affect a double

Consider decimal representations of the form d1.d2d3d4d5...dnExxx where xxx is an arbitrary exponent and both d1 and dn are nonzero.

Is the maximum n known such that there exists a decimal representation d1.d2d3d4d5...dnExxx such that the interval (d1.d2d3d4d5...dnExxx, d1.d2d3d4d5...((dn)+1)Exxx) contains an IEEE 754 double?

n should be at least 17. The question is how much above 17.

This number n has something to do with the number of digits that it is enough to consider in a decimal-to-double conversion such as strtod(). I looked at the source code for David M. Gay's implementation hoping to find an answer there. There is an allusion to “40” but it is unclear whether this is a consequence of a sound mathematical result or just a statistically safe bound. Also the comment about “truncating” makes it sound like 0.5000000000000000000000000000000000000000000000000001 may be converted to 0.5 in round-upwards mode.

Musl's implementation seems to read approximately 125*9 digits, which is a lot. Then it switches to “sticky” mode:

if (c!='0') x[KMAX-4] |= 1;

Finally, how does the answer change when substituting “contains an IEEE 754 double” with “contains the midpoint of two consecutive IEEE 754 doubles”?

like image 343
Pascal Cuoq Avatar asked Jun 21 '13 21:06

Pascal Cuoq


1 Answers

When you have a subnormal number with odd significand, that is, an odd multiple of 2^(-1074), you have a number whose last nonzero digit in the decimal representation is the 1074th after the decimal point. You then have around 300 zeros directly following the decimal point, so the number has around 750-770 significant decimal digits. The smallest positive subnormal, 2^(-1074) has 751 significant digits, and the largest positive subnormal, (2^52-1)*2^(-1074) has 767 significant digits (I think that is the maximum).

So there is at least one sequence d1, ..., d766 of decimal digits such that there is an IEEE754 double in the open interval

(d1.d2...d766E-308, d1.d2...(d766 + 1)E-308)

The answer does not change much if we consider "contains the midpoint of two consecutive IEEE754 doubles", since subnormal doubles have all roughly the same amount of significant decimal digits, and the midpoint of two consecutive such too.

In the worst case, the entire digit sequence must be consumed (consider "0.5000000...0001" with arbitrarily many zeros before the final 1 that determines that the result shall be 0.5 + 0.5^53 and not 0.5 when rounding away from zero or up).

However, there are only

floor(DBL_MANT_DIG * log 2 / log 10) + 2 = 17

significant decimal digits necessary to distinguish between different double values, so a relatively easy, albeit probably not very efficient, method of parsing would be to parse the first (at least 17) digits (and the exponent) to the closest double, and compare the input string with the exact representation of that double value (and its neighbour).

like image 74
Daniel Fischer Avatar answered Nov 03 '22 10:11

Daniel Fischer