Why does "dtoa.c" contain so much code?

Question

I'll be the first to admit that my overall knowledge of low-level programming is a bit sparse. I understand many of the core concepts, but I do not use them on a regular basis. That being said, I was absolutely astounded at how much code was needed for dtoa.c.

For the past couple of months I have been working on an ECMAScript implementation in C#, and I've been slowly filling in the holes in my engine. Last night I started working on Number.prototype.toString, which is described in section 15.7.4.2 of the ECMAScript specification (pdf). In section 9.8.1, NOTE 3 offers a link to dtoa.c, but I was looking for a challenge, so I waited to view it. The following is what I came up with.

private IDynamic ToString(Engine engine, Args args)
{
    var thisBinding = engine.Context.ThisBinding;
    if (!(thisBinding is NumberObject) && !(thisBinding is NumberPrimitive))
    {
        throw RuntimeError.TypeError("The current 'this' must be a number or a number object.");
    }

    var num = thisBinding.ToNumberPrimitive();

    if (double.IsNaN(num))
    {
        return new StringPrimitive("NaN");
    }
    else if (double.IsPositiveInfinity(num))
    {
        return new StringPrimitive("Infinity");
    }
    else if (double.IsNegativeInfinity(num))
    {
        return new StringPrimitive("-Infinity");
    }

    var radix = !args[0].IsUndefined ? args[0].ToNumberPrimitive().Value : 10D;

    if (radix < 2D || radix > 36D)
    {
        throw RuntimeError.RangeError("The parameter [radix] must be between 2 and 36.");
    }
    else if (radix == 10D)
    {
        return num.ToStringPrimitive();
    }

    var sb = new StringBuilder();
    var isNegative = false;

    if (num < 0D)
    {
        isNegative = true;
        num = -num;
    }

    var integralPart = Math.Truncate(num);
    var decimalPart = (double)((decimal)num.Value - (decimal)integralPart);
    var radixChars = RadixMap.GetArray((int)radix);

    if (integralPart == 0D)
    {
        sb.Append('0');
    }
    else
    {
        var integralTemp = integralPart;
        while (integralTemp > 0)
        {
            sb.Append(radixChars[(int)(integralTemp % radix)]);
            integralTemp = Math.Truncate(integralTemp / radix);
        }
    }

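    // Digits were appended least-significant first, so reverse them in place.
    // The two indices must stop when they meet; walking past the midpoint
    // would swap the inner characters back again.
    for (int i = 0, k = sb.Length - 1; i < k; i++, k--)
    {
        var swap = sb[i];
        sb[i] = sb[k];
        sb[k] = swap;
    }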

    if (isNegative)
    {
        sb.Insert(0, '-');
    }

    if (decimalPart == 0D)
    {
        return new StringPrimitive(sb.ToString());
    }

    var runningValue = 0D;
    var decimalIndex = 1D;
    var decimalTemp = decimalPart;

    sb.Append('.');
    while (decimalIndex < 100 && decimalPart - runningValue > 1.0e-50)
    {
        var result = decimalTemp * radix;
        var integralResult = Math.Truncate(result);
        runningValue += integralResult / Math.Pow(radix, decimalIndex++);
        decimalTemp = result - integralResult;
        sb.Append(radixChars[(int)integralResult]);
    }

    return new StringPrimitive(sb.ToString());
}

Can anyone with more experience in low-level programming explain why dtoa.c has roughly 40 times as much code? I just cannot imagine C# being that much more productive.

Rick Regan · Accepted Answer

dtoa.c contains two main functions: dtoa(), which converts a double to string, and strtod(), which converts a string to a double. It also contains a lot of support functions, most of which are for its own implementation of arbitrary-precision arithmetic. dtoa.c's claim to fame is getting these conversions right, and that can only be done, in general, with arbitrary-precision arithmetic. It also has code to round conversions correctly in four different rounding modes.
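To give a feel for why built-in types are not enough, consider the smallest positive double, 2^-1074. Its exact decimal value is 5^1074 / 10^1074, so writing it out means holding all of 5^1074, which is far beyond any native integer type. The following is a minimal C sketch of the kind of multiply-by-a-small-number routine such a conversion needs. The names (big_mul_small, big_decimal_digits, digits_of_pow5) and the base-10^9 limb layout are illustrative assumptions, not code or data structures taken from dtoa.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIMB_BASE 1000000000u /* each limb stores 9 decimal digits */

/* Multiplies a little-endian big number by a small factor in place.
   Returns the new limb count. Hypothetical helper for this sketch. */
size_t big_mul_small(uint32_t *limbs, size_t n, size_t cap, uint32_t m)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)limbs[i] * m + carry;
        limbs[i] = (uint32_t)(t % LIMB_BASE);
        carry = t / LIMB_BASE;
    }
    if (carry != 0) {
        assert(n < cap); /* caller must provide enough limbs */
        limbs[n++] = (uint32_t)carry;
    }
    return n;
}

/* Counts the decimal digits of a nonzero big number. */
size_t big_decimal_digits(const uint32_t *limbs, size_t n)
{
    size_t digits = (n - 1) * 9;
    for (uint32_t top = limbs[n - 1]; top != 0; top /= 10)
        digits++;
    return digits;
}

/* Number of decimal digits in 5^k, for k small enough to fit 128 limbs. */
size_t digits_of_pow5(unsigned k)
{
    uint32_t limbs[128] = {1};
    size_t n = 1;
    for (unsigned i = 0; i < k; i++)
        n = big_mul_small(limbs, n, 128, 5);
    return big_decimal_digits(limbs, n);
}
```

Under these assumptions, digits_of_pow5(1074) reports 751: the exact expansion of the smallest double has 751 significant digits, and a correct converter has to be able to reason about all of them.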

Your code only tries to implement the equivalent of dtoa(), and since it uses floating-point arithmetic to do its conversions, it will not always get them right. (Update: see my article http://www.exploringbinary.com/quick-and-dirty-floating-point-to-decimal-conversion/ for details.)

(I've written a lot about this on my blog, http://www.exploringbinary.com/ . Six of my last seven articles have been about strtod() conversions alone. Read through them to see how complicated it is to do correctly rounded conversions.)

Matthew Slattery · Answer

Producing good results for conversions between decimal and binary floating point representations is a rather difficult problem.

The major source of difficulty is that many decimal fractions, even simple ones, cannot be expressed exactly in binary floating point: 0.5 can (obviously), but 0.1 cannot. Going the other way (from binary to decimal), you generally don't want the exact result either, so you normally want some rounding. For example, the exact decimal value of the IEEE 754 double closest to 0.1 is 0.1000000000000000055511151231257827021181583404541015625.
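That long value can be reproduced with integer arithmetic alone. The sketch below is an illustration, not code from dtoa.c: the function name exact_fraction_digits is made up, it assumes IEEE 754 binary64 doubles, and it only accepts inputs in the range 2^-8 <= x < 1 so that every intermediate value fits in a 64-bit integer. Outside that range the numerator outgrows 64 bits, which is where arbitrary-precision arithmetic comes in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes the exact decimal expansion of x into buf as "0.ddd...".
   Restricted to 2^-8 <= x < 1 so that nothing overflows 64 bits. */
void exact_fraction_digits(double x, char *buf, size_t cap)
{
    assert(x >= 0.00390625 && x < 1.0); /* 0.00390625 is 2^-8 */
    assert(cap >= 64);                  /* "0." + at most 60 digits + NUL */

    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);     /* assumes IEEE 754 binary64 */

    int biased = (int)((bits >> 52) & 0x7FF);
    int e = 1075 - biased;              /* x == num / 2^e, 53 <= e <= 60 */
    uint64_t num = (bits & ((UINT64_C(1) << 52) - 1)) | (UINT64_C(1) << 52);
    uint64_t mask = (UINT64_C(1) << e) - 1;

    size_t n = 0;
    buf[n++] = '0';
    buf[n++] = '.';
    while (num != 0) {
        num *= 10;                      /* below 10 * 2^60, so no overflow */
        buf[n++] = (char)('0' + (int)(num >> e)); /* integer part is the digit */
        num &= mask;                    /* keep only the fractional part */
    }
    buf[n] = '\0';
}
```

Called with 0.1 and a 64-byte buffer, this yields the 55-digit expansion quoted above; called with 0.5 it yields just "0.5", since 0.5 is exactly representable.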

So, conversion often involves approximation. Good conversion routines guarantee to produce the closest possible approximation within particular (word size or number of digits) constraints. This is where most of the complexity comes from.
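One common way to see the "number of digits" constraint in action is to look for the shortest decimal string that still converts back to the same double. The following C sketch does this by brute force on top of the C library. It is not how dtoa.c works internally; the function name shortest_round_trip is hypothetical, and it assumes the platform's snprintf and strtod are themselves correctly rounded and that the input is finite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Formats x with the fewest significant digits (1 to 17) that still
   parse back to exactly x. Returns the precision that was used. */
int shortest_round_trip(double x, char *buf, size_t cap)
{
    assert(cap >= 32); /* room for sign, 17 digits, point, exponent, NUL */
    for (int prec = 1; prec <= 17; prec++) {
        snprintf(buf, cap, "%.*g", prec, x);
        if (strtod(buf, NULL) == x)
            return prec;
    }
    return 17; /* 17 significant digits always suffice for a finite double */
}
```

With a correctly rounding C library, 0.1 comes back as "0.1" with one digit, while 0.30000000000000004 (the result of 0.1 + 0.2) needs all 17. The hard part, and the bulk of a file like dtoa.c, is producing those digits correctly in the first place rather than borrowing them from the library.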

Take a look at the paper cited in the comments at the top of the dtoa.c implementation, Clinger's How to Read Floating Point Numbers Accurately, for a flavour of the problem, and perhaps the paper by David M. Gay (the author of dtoa.c), Correctly Rounded Binary-Decimal and Decimal-Binary Conversions.

(Also, more generally: What Every Computer Scientist Should Know About Floating Point Arithmetic.)

dkackman · Answer

Based on a quick glance, a fair amount of the C version deals with portability: the file appears to be meant to be generically usable across compilers (C and C++), bitnesses, floating-point implementations, and platforms, with tons of #define configurability.
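As a rough illustration of what that kind of configurability looks like, here is a small C sketch in which a compile-time switch selects which 32-bit half of a double holds the sign and exponent. The macro DEMO_WORDS_BIG_ENDIAN and the function biased_exponent are invented for this example and are not dtoa.c's own names; the sketch also assumes an IEEE 754 64-bit double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* DEMO_WORDS_BIG_ENDIAN is a hypothetical switch invented for this sketch.
   Define it when the most significant word of a double comes first. */
#ifdef DEMO_WORDS_BIG_ENDIAN
#define HIGH_WORD 0
#else
#define HIGH_WORD 1
#endif

static_assert(sizeof(double) == 2 * sizeof(uint32_t), "expects a 64-bit double");

/* Returns the 11-bit biased exponent field of x. */
unsigned biased_exponent(double x)
{
    uint32_t w[2];
    memcpy(w, &x, sizeof w); /* copy the raw bytes of the double */
    return (unsigned)((w[HIGH_WORD] >> 20) & 0x7FFu);
}
```

On a little-endian machine built without the switch, biased_exponent(1.0) is 1023. Multiply this pattern by every byte order, integer width, and floating-point format a portable file wants to support, and the line count grows quickly.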

Tags: javascript, c, c#, floating-point, ecma262

ChaosPandion
