Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Round-twice error in .NET's Double.ToString method

Mathematically, consider for this question the rational number

8725724278030350 / 2**48

where ** in the denominator denotes exponentiation, i.e. the denominator is 2 to the 48th power. (The fraction is not in lowest terms, reducible by 2.) This number is exactly representable as a System.Double. Its decimal expansion is

31.0000000000000'49'73799150320701301097869873046875 (exact)

where the apostrophes do not represent missing digits but merely mark the boudaries where rounding to 15 resp. 17 digits is to be performed.

Note the following: If this number is rounded to 15 digits, the result will be 31 (followed by thirteen 0s) because the next digits (49...) begin with a 4 (meaning round down). But if the number is first rounded to 17 digits and then rounded to 15 digits, the result could be 31.0000000000001. This is because the first rounding rounds up by increasing the 49... digits to 50 (terminates) (next digits were 73...), and the second rounding might then round up again (when the midpoint-rounding rule says "round away from zero").

(There are many more numbers with the above characteristics, of course.)

Now, it turns out that .NET's standard string representation of this number is "31.0000000000001". The question: Isn't this a bug? By standard string representation we mean the String produced by the parameterles Double.ToString() instance method which is of course identical to what is produced by ToString("G").

An interesting thing to note is that if you cast the above number to System.Decimal then you get a decimal that is 31 exactly! See this Stack Overflow question for a discussion of the surprising fact that casting a Double to Decimal involves first rounding to 15 digits. This means that casting to Decimal makes a correct round to 15 digits, whereas calling ToSting() makes an incorrect one.

To sum up, we have a floating-point number that, when output to the user, is 31.0000000000001, but when converted to Decimal (where 29 digits are available), becomes 31 exactly. This is unfortunate.

Here's some C# code for you to verify the problem:

static void Main()
{
  const double evil = 31.0000000000000497;
  string exactString = DoubleConverter.ToExactString(evil); // Jon Skeet, http://csharpindepth.com/Articles/General/FloatingPoint.aspx 

  Console.WriteLine("Exact value (Jon Skeet): {0}", exactString);   // writes 31.00000000000004973799150320701301097869873046875
  Console.WriteLine("General format (G): {0}", evil);               // writes 31.0000000000001
  Console.WriteLine("Round-trip format (R): {0:R}", evil);          // writes 31.00000000000005

  Console.WriteLine();
  Console.WriteLine("Binary repr.: {0}", String.Join(", ", BitConverter.GetBytes(evil).Select(b => "0x" + b.ToString("X2"))));

  Console.WriteLine();
  decimal converted = (decimal)evil;
  Console.WriteLine("Decimal version: {0}", converted);             // writes 31
  decimal preciseDecimal = decimal.Parse(exactString, CultureInfo.InvariantCulture);
  Console.WriteLine("Better decimal: {0}", preciseDecimal);         // writes 31.000000000000049737991503207
}

The above code uses Skeet's ToExactString method. If you don't want to use his stuff (can be found through the URL), just delete the code lines above dependent on exactString. You can still see how the Double in question (evil) is rounded and cast.

ADDITION:

OK, so I tested some more numbers, and here's a table:

  exact value (truncated)       "R" format         "G" format     decimal cast
 -------------------------  ------------------  ----------------  ------------
 6.00000000000000'53'29...  6.0000000000000053  6.00000000000001  6
 9.00000000000000'53'29...  9.0000000000000053  9.00000000000001  9
 30.0000000000000'49'73...  30.00000000000005   30.0000000000001  30
 50.0000000000000'49'73...  50.00000000000005   50.0000000000001  50
 200.000000000000'51'15...  200.00000000000051  200.000000000001  200
 500.000000000000'51'15...  500.00000000000051  500.000000000001  500
 1020.00000000000'50'02...  1020.000000000005   1020.00000000001  1020
 2000.00000000000'50'02...  2000.000000000005   2000.00000000001  2000
 3000.00000000000'50'02...  3000.000000000005   3000.00000000001  3000
 9000.00000000000'54'56...  9000.0000000000055  9000.00000000001  9000
 20000.0000000000'50'93...  20000.000000000051  20000.0000000001  20000
 50000.0000000000'50'93...  50000.000000000051  50000.0000000001  50000
 500000.000000000'52'38...  500000.00000000052  500000.000000001  500000
 1020000.00000000'50'05...  1020000.000000005   1020000.00000001  1020000

The first column gives the exact (though truncated) value that the Double represent. The second column gives the string representation from the "R" format string. The third column gives the usual string representation. And finally the fourth column gives the System.Decimal that results from converting this Double.

We conclude the following:

  • Round to 15 digits by ToString() and round to 15 digits by conversion to Decimal disagree in very many cases
  • Conversion to Decimal also rounds incorrectly in many cases, and the errors in these cases cannot be described as "round-twice" errors
  • In my cases, ToString() seems to yield a bigger number than Decimal conversion when they disagree (no matter which of the two rounds correctly)

I only experimented with cases like the above. I haven't checked if there are rounding errors with numbers of other "forms".

like image 264
Jeppe Stig Nielsen Avatar asked Jun 18 '12 14:06

Jeppe Stig Nielsen


1 Answers

So from your experiments, it appears that Double.ToString doesn't do correct rounding.

That's rather unfortunate, but not particularly surprising: doing correct rounding for binary to decimal conversions is nontrivial, and also potentially quite slow, requiring multiprecision arithmetic in corner cases. See David Gay's dtoa.c code here for one example of what's involved in correctly-rounded double-to-string and string-to-double conversion. (Python currently uses a variant of this code for its float-to-string and string-to-float conversions.)

Even the current IEEE 754 standard for floating-point arithmetic recommends, but doesn't require that conversions from binary floating-point types to decimal strings are always correctly rounded. Here's a snippet, from section 5.12.2, "External decimal character sequences representing finite numbers".

There might be an implementation-defined limit on the number of significant digits that can be converted with correct rounding to and from supported binary formats. That limit, H, shall be such that H ≥ M+3 and it should be that H is unbounded.

Here M is defined as the maximum of Pmin(bf) over all supported binary formats bf, and since Pmin(float64) is defined as 17 and .NET supports the float64 format via the Double type, M should be at least 17 on .NET. In short, this means that if .NET were to follow the standard, it would be providing correctly rounded string conversions up to at least 20 significant digits. So it looks as though the .NET Double doesn't meet this standard.

In answer to the 'Is this a bug' question, much as I'd like it to be a bug, there really doesn't seem to be any claim of accuracy or IEEE 754 conformance anywhere that I can find in the number formatting documentation for .NET. So it might be considered undesirable, but I'd have a hard time calling it an actual bug.


EDIT: Jeppe Stig Nielsen points out that the System.Double page on MSDN states that

Double complies with the IEC 60559:1989 (IEEE 754) standard for binary floating-point arithmetic.

It's not clear to me exactly what this statement of compliance is supposed to cover, but even for the older 1985 version of IEEE 754, the string conversion described seems to violate the binary-to-decimal requirements of that standard.

Given that, I'll happily upgrade my assessment to 'possible bug'.

like image 159
Mark Dickinson Avatar answered Sep 28 '22 11:09

Mark Dickinson