How to manually parse a floating point number from a string

Tags:

floating-point

parsing

precision

Of course most languages have library functions for this, but suppose I want to do it myself.

Suppose that the float is given like in a C or Java program (except for the 'f' or 'd' suffix), for example "4.2e1", ".42e2" or simply "42". In general, we have the "integer part" before the decimal point, the "fractional part" after the decimal point, and the "exponent". All three are integers.

It is easy to find and process the individual digits, but how do you compose them into a value of type float or double without losing precision?

I'm thinking of multiplying the integer part by 10^n, where n is the number of digits in the fractional part, then adding the fractional part to the result and subtracting n from the exponent. This effectively turns 4.2e1 into 42e0, for example. Then I could use the pow function to compute 10^exponent and multiply that by the new integer part. The question is, does this method guarantee maximum precision throughout?
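To make the idea concrete, here is a rough sketch of what I have in mind. The names parse_naive and pow10_naive are made up for illustration, sign handling and input validation are left out, and a small repeated-multiplication helper stands in for pow(10.0, exponent) so the sketch only needs ctype.h.

```c
#include <ctype.h>  /* isdigit */

/* Hypothetical stand-in for pow(10.0, n), n >= 0, by repeated multiplication. */
static double pow10_naive(int n) {
  double p = 1.0;
  while (n-- > 0)
    p *= 10.0;
  return p;
}

/* Naive conversion: fold every digit into one number, track the decimal
   exponent, then scale once at the end. No sign, no error checking. */
double parse_naive(const char *s) {
  double mantissa = 0.0;
  int exponent = 0;

  while (isdigit((unsigned char)*s))       /* integer part */
    mantissa = mantissa * 10.0 + (*s++ - '0');

  if (*s == '.') {                         /* fractional part */
    s++;
    while (isdigit((unsigned char)*s)) {
      mantissa = mantissa * 10.0 + (*s++ - '0');
      exponent--;                          /* one more digit after the point */
    }
  }

  if (*s == 'e' || *s == 'E') {            /* explicit exponent */
    int esign = 1, e = 0;
    s++;
    if (*s == '+' || *s == '-') {
      if (*s == '-') esign = -1;
      s++;
    }
    while (isdigit((unsigned char)*s))
      e = e * 10 + (*s++ - '0');
    exponent += esign * e;
  }

  /* Scale by the power of ten; each step here may round. */
  return exponent >= 0 ? mantissa * pow10_naive(exponent)
                       : mantissa / pow10_naive(-exponent);
}
```

With this, parse_naive("4.2e1"), parse_naive(".42e2") and parse_naive("42") all accumulate the digits 42 with a net exponent of 0. My worry is that the digit accumulation, the power of ten, and the final multiply or divide can each round once the numbers get large enough.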

Any thoughts on this?

761

asked Sep 17 '08 16:09

Thomas

2 Answers

All of the other answers have missed how hard it is to do this properly. You can do a first cut approach at this which is accurate to a certain extent, but until you take into account IEEE rounding modes (et al), you will never have the right answer. I've written naive implementations before with a rather large amount of error.

If you're not scared of math, I highly recommend reading David Goldberg's article "What Every Computer Scientist Should Know About Floating-Point Arithmetic". You'll get a better understanding of what is going on under the hood, and why the bits are laid out the way they are.

My best advice is to start with a working atoi implementation and move out from there. You'll rapidly find you're missing things, but a few looks at strtod's source and you'll be on the right path (which is a long, long path). Eventually you'll praise (insert deity here) that there are standard libraries.

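    /* use this to start your atof implementation */

    /* atoi - [email protected] */
    /* PUBLIC DOMAIN */
    #include <ctype.h>   /* isspace, isdigit */
    #include <errno.h>   /* errno, ERANGE */
    #include <limits.h>  /* LONG_MAX, LONG_MIN */
    #include <stddef.h>  /* size_t */

    /* named my_atoi so it does not clash with the standard library's atoi */
    long my_atoi(const char *value) {
      unsigned long ival = 0, limit, digit;
      int c, neg = 0;  /* the sign must be tracked in a signed variable */
      size_t i = 0;

      while (isspace((unsigned char)value[i])) /* chomp leading spaces */
        i++;
      c = value[i];
      if (c == '-' || c == '+') { /* chomp sign */
        neg = (c == '-');
        i++;
      }
      /* largest magnitude that fits for this sign */
      limit = neg ? (unsigned long)LONG_MAX + 1UL : (unsigned long)LONG_MAX;
      while ((c = (unsigned char)value[i++]) != '\0') { /* parse number */
        if (!isdigit(c)) return 0; /* reject anything that is not a digit */
        digit = (unsigned long)(c - '0');
        /* check before mult/accum so ival can never wrap around */
        if (ival > (limit - digit) / 10) {
          /* report overflow/underflow */
          errno = ERANGE;
          return neg ? LONG_MIN : LONG_MAX;
        }
        ival = (ival * 10) + digit; /* mult/accum */
      }
      if (neg) /* avoid converting LONG_MAX + 1 to long */
        return ival > (unsigned long)LONG_MAX ? LONG_MIN : -(long)ival;
      return (long)ival;
    }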

138

answered Nov 10 '22 15:11

user7116

The "standard" algorithm for converting a decimal number to the best floating-point approximation is William Clinger's "How to Read Floating Point Numbers Accurately". Note that doing this correctly requires multiple-precision integers, at least a certain percentage of the time, in order to handle corner cases.
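To illustrate the "certain percentage of the time" point, here is a minimal sketch of the easy case, where no multiple-precision arithmetic is needed. It assumes IEEE 754 doubles, round-to-nearest, and arithmetic carried out in double precision rather than a wider intermediate format. If the decimal digits fit in 53 bits and the power of ten is exactly representable (10^0 through 10^22), a single multiply or divide is correctly rounded. The names fast_path and exact_pow10 are hypothetical, not taken from the paper.

```c
#include <stdint.h>

/* Powers of ten that an IEEE 754 double represents exactly. */
static const double exact_pow10[] = {
  1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
  1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
  1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
};

/* Hypothetical helper: computes digits * 10^exp10.
   Returns 1 and stores the result in *out when the fast path applies.
   Returns 0 when it does not; the caller must then fall back to a
   slower, arbitrary-precision method. */
int fast_path(uint64_t digits, int exp10, double *out) {
  double d;

  if (digits > (UINT64_C(1) << 53))  /* would not convert exactly */
    return 0;
  if (exp10 < -22 || exp10 > 22)     /* power of ten is not exact */
    return 0;

  d = (double)digits;                /* exact conversion */
  /* Both operands are exact, so the single operation rounds only once. */
  *out = exp10 >= 0 ? d * exact_pow10[exp10] : d / exact_pow10[-exp10];
  return 1;
}
```

For example, "4.2" would be passed as digits 42 with exp10 of -1. Everything outside these bounds (long digit strings, large exponents) is where the hard cases live.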

Algorithms for going the other way, printing the best decimal number from a floating-point number, are found in Burger and Dybvig's "Printing Floating-Point Numbers Quickly and Accurately". This also requires multiple-precision integer arithmetic.
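To show what "best decimal number" means in practice, here is a brute-force sketch. It is not the Burger and Dybvig algorithm: it simply leans on the C library, trying increasing precisions until the text reads back as the same double. It assumes the platform's snprintf and strtod are correctly rounded, and shortest_repr is a made-up name.

```c
#include <stdio.h>   /* snprintf */
#include <stdlib.h>  /* strtod */

/* Hypothetical helper: writes into buf the shortest "%.{p}g" rendering
   of x that strtod converts back to exactly x, and returns the number
   of significant digits used (1..17). */
int shortest_repr(double x, char *buf, size_t size) {
  int p;
  for (p = 1; p <= 17; p++) {
    snprintf(buf, size, "%.*g", p, x);
    if (strtod(buf, NULL) == x)  /* round trip succeeded */
      break;
  }
  /* 17 significant digits are always enough for a finite double;
     the clamp only matters for NaN, which never compares equal. */
  return p > 17 ? 17 : p;
}
```

With a 32-byte buffer, shortest_repr(0.1, buf, sizeof buf) stops at one digit and leaves "0.1" in buf, while a value such as 0.1 + 0.2 needs all 17 digits.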

See also David M. Gay's "Correctly Rounded Binary-Decimal and Decimal-Binary Conversions" for algorithms going both ways.

answered Nov 10 '22 15:11

Peter S. Housel
