Do C#/.NET floating point operations differ in precision between debug mode and release mode?
Here's a simple example where results not only differ between debug and release mode, but the way by which they do so depend on whether one uses x86 or x84 as a platform:
Single f1 = 0.00000000002f;
Single f2 = 1 / f1;
Double d = f2;
Console.WriteLine(d);
This writes the following results:
Debug Release
x86 49999998976 50000000199,7901
x64 49999998976 49999998976
A quick look at the disassembly (Debug -> Windows -> Disassembly in Visual Studio) gives some hints about what's going on here. For the x86 case:
Debug Release
mov dword ptr [ebp-40h],2DAFEBFFh | mov dword ptr [ebp-4],2DAFEBFFh
fld dword ptr [ebp-40h] | fld dword ptr [ebp-4]
fld1 | fld1
fdivrp st(1),st | fdivrp st(1),st
fstp dword ptr [ebp-44h] |
fld dword ptr [ebp-44h] |
fstp qword ptr [ebp-4Ch] |
fld qword ptr [ebp-4Ch] |
sub esp,8 | sub esp,8
fstp qword ptr [esp] | fstp qword ptr [esp]
call 6B9783BC | call 6B9783BC
In particular, we see that a bunch of seemingly redundant "store the value from the floating point register in memory, then immediately load it back from memory into the floating point register" have been optimized away in release mode. However, the two instructions
fstp dword ptr [ebp-44h]
fld dword ptr [ebp-44h]
are enough to change the value in the x87 register from +5.0000000199790138e+0010 to +4.9999998976000000e+0010 as one may verify by stepping through the disassembly and investigating the values of the relevant registers (Debug -> Windows -> Registers, then right click and check "Floating point").
The story for x64 is wildly different. We still see the same optimization removing a few instructions, but this time around, everything relies on SSE with its 128-bit registers and dedicated instruction set:
Debug Release
vmovss xmm0,dword ptr [7FF7D0E104F8h] | vmovss xmm0,dword ptr [7FF7D0E304C8h]
vmovss dword ptr [rbp+34h],xmm0 | vmovss dword ptr [rbp-4],xmm0
vmovss xmm0,dword ptr [7FF7D0E104FCh] | vmovss xmm0,dword ptr [7FF7D0E304CCh]
vdivss xmm0,xmm0,dword ptr [rbp+34h] | vdivss xmm0,xmm0,dword ptr [rbp-4]
vmovss dword ptr [rbp+30h],xmm0 |
vcvtss2sd xmm0,xmm0,dword ptr [rbp+30h] | vcvtss2sd xmm0,xmm0,xmm0
vmovsd qword ptr [rbp+28h],xmm0 |
vmovsd xmm0,qword ptr [rbp+28h] |
call 00007FF81C9343F0 | call 00007FF81C9343F0
Here, because the SSE unit avoids using higher precision than single precision internally (while the x87 unit does), we end up with the "single precision-ish" result of the x86 case regardless of optimizations. Indeed, one finds (after enabling the SSE registers in the Visual Studio Registers overview) that after vdivss
, XMM0 contains 0000000000000000-00000000513A43B7 which is exactly the 49999998976 from before.
Both of the discrepancies bit me in practice. Besides illustrating that one should never compare equality of floating points, the example also shows that there's still room for assembly debugging in a high-level language such as C#, the moment floating points show up.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With