
Output of strtoull() loses precision when converted to double and then back to uint64_t

Consider the following:

#include <iostream>
#include <cstdint>
#include <cstdlib>   // std::strtoull

int main() {
   std::cout << std::hex
      << "0x" << std::strtoull("0xFFFFFFFFFFFFFFFF",0,16) << std::endl
      << "0x" << uint64_t(double(std::strtoull("0xFFFFFFFFFFFFFFFF",0,16))) << std::endl
      << "0x" << uint64_t(double(uint64_t(0xFFFFFFFFFFFFFFFF))) << std::endl;
   return 0;
}

Which prints:

0xffffffffffffffff
0x0
0xffffffffffffffff

The first number is just the result of converting ULLONG_MAX from a string to a uint64_t, which works as expected.

However, if I cast the result to double and then back to uint64_t, it prints 0 (the second number).

Normally I would attribute this to the limited precision of floating-point types, but what puzzles me further is that if I cast ULLONG_MAX from uint64_t to double and then back to uint64_t, the result is correct (the third number).

Why the discrepancy between the second and the third result?

EDIT (by @Radoslaw Cybulski) For another what-is-going-on-here try this code:

#include <iostream>
#include <cstdint>
#include <cstdlib>   // std::strtoull
using namespace std;

int main() {
    uint64_t z1 = std::strtoull("0xFFFFFFFFFFFFFFFF",0,16);
    uint64_t z2 = 0xFFFFFFFFFFFFFFFFull;
    std::cout << z1 << " " << uint64_t(double(z1)) << "\n";
    std::cout << z2 << " " << uint64_t(double(z2)) << "\n";
    return 0;
}

which happily prints:

18446744073709551615 0
18446744073709551615 18446744073709551615
Adama asked Jul 19 '19 13:07



1 Answer

The number closest to 0xFFFFFFFFFFFFFFFF (18446744073709551615) that is representable by double (assuming 64-bit IEEE 754) is 18446744073709551616, i.e. 2^64. This is larger than 0xFFFFFFFFFFFFFFFF, so it is outside the representable range of uint64_t.

Regarding the conversion back to an integer, the standard says (quoting the latest draft):

[conv.fpint]

A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.


Why the discrepancy between the second and the third result?

Because the behaviour of the program is undefined.

Although it is mostly pointless to analyse the reasons for differences in UB, because the scope of variation is limitless, my guess at the reason for the discrepancy in this case is that in one case the value is a compile-time constant, while in the other there is a call to a library function that is invoked at runtime.

eerorika answered Nov 04 '22 19:11
