
Cast from unsigned long long to double and vice versa changes the value

While writing some C++ code, I suddenly realised that my numbers are converted incorrectly from double to unsigned long long.

To be specific, I use the following code:

#define _CRT_SECURE_NO_WARNINGS

#include <iostream>
#include <limits>
using namespace std;

int main()
{
  unsigned long long ull = numeric_limits<unsigned long long>::max();
  double d = static_cast<double>(ull);
  unsigned long long ull2 = static_cast<unsigned long long>(d);
  cout << ull << endl << d << endl << ull2 << endl;
  return 0;
}

Ideone live example.

When this code is executed on my computer, I have the following output:

18446744073709551615
1.84467e+019
9223372036854775808
Press any key to continue . . .

I expected the first and third numbers to be exactly the same (just like on Ideone) because I was sure that long double took 10 bytes and stored the mantissa in 8 of them. I would understand if the third number were truncated compared to the first one, in case I'm wrong about the floating-point format. But here the values differ by a factor of two!

So, the main question is: why? And how can I predict such situations?

Some details: I use Visual Studio 2013 on Windows 7, compile for x86, and sizeof(long double) == 8 for my system.

Asked by alexeykuzmin0 on Nov 20 '15 at 10:11



2 Answers

18446744073709551615 is not exactly representable in double (in IEEE754). This is not unexpected: an IEEE754 double has only a 53-bit significand, so a 64-bit floating-point type cannot represent every integer that is representable in 64 bits.

According to the C++ Standard, it is implementation-defined whether the next-highest or next-lowest double value is used. Apparently your system selects the next-highest value, which seems to be 1.8446744073709552e19 (that is, 2^64). You could confirm this by outputting the double with more digits of precision.

Note that this is larger than the original number, and therefore larger than any value an unsigned long long can hold.

When you convert this double to integer, the behaviour is covered by [conv.fpint]/1:

A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.

So this code potentially causes undefined behaviour. When undefined behaviour has occurred, anything can happen, including (but not limited to) bogus output.


The question was originally posted with long double, rather than double. On my gcc, the long double case behaves correctly, but on OP's MSVC it gave the same error. This could be explained by gcc using 80-bit long double, but MSVC using 64-bit long double.

Answered by M.M on Oct 21 '22 at 09:10



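It's due to double only approximating unsigned long long. A double has a 53-bit significand, so near 10^19 adjacent doubles are 2048 apart and the rounding error can reach about 1000 units; when you convert values around the upper limit of the unsigned long long range, the result can round up past that limit and overflow. Try converting a value that is 10000 lower instead :)

BTW, on Cygwin, the third printed value is zero.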

Answered by AndreyS Scherbakov on Oct 21 '22 at 08:10
