Incorrect double to long conversion

Tags:

c++

visual-c++

This is mainly a follow-up to this other question, which was about a weird conversion from long to double and back again to long for big values.

I already know that converting a floating-point value to an integral type truncates, and that if the truncated value cannot be represented in the target type, the behaviour is undefined:

4.9 Floating-integral conversions [conv.fpint]

A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.

But here is my code to demonstrate the problem, assuming a little-endian architecture where both long long and long double use 64 bits:

#include <iostream>
#include <iomanip>

using namespace std;

int main()
{
  unsigned long long ull = 0xf000000000000000;
  long double d = static_cast<long double>(ull);
  // dump the IEEE 754 representation, most significant byte first (little-endian system)
  unsigned char * pt = reinterpret_cast<unsigned char *>(&d);
  for (int i = sizeof(d) -1; i>= 0; i--) {
      cout << hex << setw(2) << setfill('0') << static_cast<unsigned int>(pt[i]); 
  }
  cout << endl;
  unsigned long long ull2 = static_cast<unsigned long long>(d);
  cout << ull << endl << d << endl << ull2 << endl;
  return 0;
}

The output is (using 32-bit MSVC 2008 on an old 32-bit XP box):

43ee000000000000
f000000000000000
1.72938e+019
8000000000000000

Explanation of the values:

0xf000000000000000 is 17293822569102704640 in decimal, so the conversion to long double (printed as 1.72938e+019) is correct.
43ee000000000000: the mantissa field is e000000000000; with the implied leading 1, the significand is binary 1.111 (four 1 bits followed by 0s). The exponent field is 0x43e; removing the 0x3ff bias gives 0x3f = 63. The value is therefore binary 1.111 × 2⁶³, which is exactly 0xf000000000000000 or 17293822569102704640 (ref)
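The decoding can be checked mechanically. The sketch below rebuilds a value from a raw binary64 bit pattern; the function name `decode_binary64` is made up for illustration, and it only handles normal, positive numbers (no zeros, subnormals, infinities or NaNs).

```cpp
#include <cmath>
#include <cstdint>

// Illustrative helper (hypothetical name): rebuilds the value of a normal,
// positive IEEE 754 binary64 number from its raw bit pattern.
double decode_binary64(std::uint64_t bits)
{
    // Bits 52..62 hold the biased exponent (0x43e for the dump above).
    const int exponent_field = static_cast<int>((bits >> 52) & 0x7ff);
    // Bits 0..51 hold the fraction (0xe000000000000 for the dump above).
    const std::uint64_t fraction = bits & 0xfffffffffffffULL;
    // Significand 1.fraction, stored as an integer scaled by 2^52.
    const std::uint64_t significand = (1ULL << 52) | fraction;
    // value = significand * 2^(exponent_field - bias - 52), with bias = 1023 (0x3ff).
    return std::ldexp(static_cast<double>(significand), exponent_field - 1023 - 52);
}
```

For 0x43ee000000000000 this computes 0x1e000000000000 × 2¹¹, which is 15 × 2⁶⁰, the same value as 0xf000000000000000.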

As that value can be represented as an unsigned long long, I expected its conversion to unsigned long long to give back the original value, but MSVC gives 0x8000000000000000, or 9223372036854775808.

The question is: is that conversion caused by undefined behaviour, as suggested by the accepted answer to the other question, or is it really an MSVC bug?

(Note: the same code compiled with Clang on a FreeBSD 10.1 box gives correct results.)

For reference, here is the generated code:

  unsigned long long ull2 = static_cast<unsigned long long>(d);
0041159E  fld         qword ptr [d] 
004115A1  call        @ILT+490(__ftol2) (4111EFh) 
004115A6  mov         dword ptr [ull2],eax 
004115A9  mov         dword ptr [ebp-40h],edx

And the code for _ftol2 seems to be (obtained from the debugger at run time):

00411C66  push        ebp  
00411C67  mov         ebp,esp 
00411C69  sub         esp,20h 
00411C6C  and         esp,0FFFFFFF0h 
00411C6F  fld         st(0) 
00411C71  fst         dword ptr [esp+18h] 
00411C75  fistp       qword ptr [esp+10h] 
00411C79  fild        qword ptr [esp+10h] 
00411C7D  mov         edx,dword ptr [esp+18h] 
00411C81  mov         eax,dword ptr [esp+10h] 
00411C85  test        eax,eax 
00411C87  je          integer_QnaN_or_zero (411CC5h) 
00411C89  fsubp       st(1),st 
00411C8B  test        edx,edx 
00411C8D  jns         positive (411CADh) 
00411C8F  fstp        dword ptr [esp] 
00411C92  mov         ecx,dword ptr [esp] 
00411C95  xor         ecx,80000000h 
00411C9B  add         ecx,7FFFFFFFh 
00411CA1  adc         eax,0 
00411CA4  mov         edx,dword ptr [esp+14h] 
00411CA8  adc         edx,0 
00411CAB  jmp         localexit (411CD9h) 
00411CAD  fstp        dword ptr [esp] 
00411CB0  mov         ecx,dword ptr [esp] 
00411CB3  add         ecx,7FFFFFFFh 
00411CB9  sbb         eax,0 
00411CBC  mov         edx,dword ptr [esp+14h] 
00411CC0  sbb         edx,0 
00411CC3  jmp         localexit (411CD9h) 
00411CC5  mov         edx,dword ptr [esp+14h] 
00411CC9  test        edx,7FFFFFFFh 
00411CCF  jne         arg_is_not_integer_QnaN (411C89h) 
00411CD1  fstp        dword ptr [esp+18h] 
00411CD5  fstp        dword ptr [esp+18h] 
00411CD9  leave            
00411CDA  ret
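One plausible reading of this listing (an interpretation, not something confirmed by Microsoft documentation): `fistp qword ptr` stores a signed 64-bit integer, and when the value does not fit and the invalid-operation exception is masked, the x87 stores the "integer indefinite" pattern 0x8000000000000000. With a low dword of 0 and a high dword of 0x80000000, the routine takes the integer_QnaN_or_zero path and returns that pattern unchanged, which matches the observed output. The sketch below emulates only that store, for inputs without a fractional part; the function name is hypothetical.

```cpp
#include <cstdint>

// Emulates the bit pattern a signed 64-bit x87 store (fistp qword) leaves in
// memory for a double that has no fractional part.
// Assumption: the invalid-operation exception is masked, so out-of-range
// inputs (and NaN) produce the "integer indefinite" value 0x8000000000000000.
std::uint64_t emulate_signed_store(double d)
{
    const double two_pow_63 = 9223372036854775808.0;
    // Representable signed range is [-2^63, 2^63); NaN fails both comparisons.
    if (!(d >= -two_pow_63 && d < two_pow_63)) {
        return 0x8000000000000000ULL; // integer indefinite
    }
    // In range: the signed result, reinterpreted as its two's complement bits.
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(d));
}
```

Under that assumption, every input from 2⁶³ up to (but not including) 2⁶⁴ would come back as 0x8000000000000000, even though it is representable as an unsigned long long.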

asked Nov 20 '15 14:11

Serge Ballesta

1 Answer

This is mainly a compilation of the comments on the question.

It appears that old MSVC versions incorrectly process conversions between 64-bit integers and 64-bit double-precision numbers.

The bug is present in versions up to and including 2008 (the question reproduces it with MSVC 2008 in 32-bit mode).

MSVC 2010 gives the wrong result in 32-bit mode and the correct one in 64-bit mode.

All versions starting with 2012 are correct.
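For code that still has to build with an affected compiler, one common workaround is to go through a signed conversion only. The sketch below is untested on the affected MSVC versions and rests on two assumptions: signed 64-bit conversions are correct there, and the input satisfies 0 <= d < 2⁶⁴. The function name `double_to_ull` is hypothetical.

```cpp
// Hypothetical workaround sketch: converts a double in [0, 2^64) to
// unsigned long long using only double -> signed long long conversions.
// Assumes 64-bit long long and unsigned long long.
unsigned long long double_to_ull(double d)
{
    const double two_pow_63 = 9223372036854775808.0;
    if (d >= two_pow_63) {
        // d - 2^63 is exact for d in [2^63, 2^64) and fits in a signed 64-bit integer.
        const long long low = static_cast<long long>(d - two_pow_63);
        // Add the 2^63 offset back in unsigned arithmetic.
        return static_cast<unsigned long long>(low) + 0x8000000000000000ULL;
    }
    // Below 2^63 the signed conversion already covers the whole range.
    return static_cast<unsigned long long>(static_cast<long long>(d));
}
```

Values outside [0, 2⁶⁴) and NaN are not handled and would need a separate range check first.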

answered Oct 23 '22 20:10

Serge Ballesta
