
What is required printf precision for a __float128 to not lose information?

I'm trying to printf a __float128 using libquadmath, e.g.:

quadmath_snprintf(s, sizeof(s), "%.30Qg", f);

With the following three constraints:

  1. The output must match the following production:

     number = [ minus ] int [ frac ] [ exp ]
     decimal-point = %x2E       ; .
     digit1-9 = %x31-39         ; 1-9
     e = %x65 / %x45            ; e E
     exp = e [ minus / plus ] 1*DIGIT
     frac = decimal-point 1*DIGIT
     int = zero / ( digit1-9 *DIGIT )
     minus = %x2D               ; -
     plus = %x2B                ; +
     zero = %x30                ; 0
    
  2. Given any input __float128 "i" that has been printfed to a string "s" matching the above production, if "s" is then scanfed back into a __float128 "j", "i" must be bitwise identical to "j", i.e. no information should be lost. For at least some values this is not possible (NaN, Infinity); what is the complete list of those values?

  3. There should be no other string satisfying the above two criteria, that is shorter than the candidate.

Is there a quadmath_snprintf format string that satisfies the above (1, 3 and 2 when possible)? If so what is it?

What are the values of __float128 that cannot be represented accurately enough by the above production to satisfy point 2 (e.g. NaN, +/-Infinity, etc.)? How do I detect whether a __float128 is holding one of these values?

asked Jan 28 '12 at 12:01 by Andrew Tomazos




1 Answer

If you're on x86, then the GCC __float128 type is a software implementation of the IEEE 754-2008 binary128 format. The IEEE 754 standard requires that a binary -> char -> binary round trip recovers the original value if the character representation contains at least 36 significant (decimal) digits; binary128 has a 113-bit significand, and ceil(1 + 113 * log10(2)) = 36. Thus the format string %.36Qg ought to do it, whereas the %.30Qg in the question is too short to guarantee this.

The standard does not require that a NaN round trip recover the original bitwise value.

As for your requirement #3, libquadmath does not contain code for this kind of "shortest representation" formatting, e.g. in the spirit of the Steele & White paper or the code by David Gay.

answered Nov 14 '22 at 23:11 by janneb
