Consider the following code:
// Filename fputest.cpp
#include <cmath>
#include <cstdio>
int main()
{
double x;
*(__int64 *) &x = 0xc01448ec3aaa278di64; // -5.0712136427263319
double sine1 = sin(x);
printf("%016llX\n", sine1);
double sine2;
__asm {
fld x
fsin
fstp sine2
}
printf("%016llX\n", sine2);
return 0;
}
When compiled with Visual C++ 2012 (cl fputest.cpp
) and the program is executed, the output is the following:
3FEDF640D8D36174
3FEDF640D8D36175
Questions:
This problem is not caused by conversion from long double to double. It may be due to inaccuracy in the sin
routine in the math library.
The fsin
instruction is specified to produce a result within 1 ULP (in the long double format) for operands within its range (per Intel 64 and IA-32 Architectures Software Developer's Manual, October 2011, volume 1, 8.3.10), in round-to-nearest mode. On an Intel Core i7, fsin
of the questioner’s value, −5.07121364272633190495298549649305641651153564453125 or -0x1.448ec3aaa278dp+2, produces 0xe.fb206c69b0ba402p-4. We can easily see from this hexadecimal that the last 11 bits are 100 0000 0010. Those are the bits that will be rounded when converting from long double. If they are greater than 100 0000 0000, the number will be rounded up. They are greater. Therefore, the result of converting this long double value to double is 0xe.fb206c69b0ba8p-4, which equals 0x1.df640d8d36175p-1 and 0.93631021832247418590355891865328885614871978759765625. Also note that even if the result were one ULP lower, the last 11 bits would still be greater than 100 0000 0000 and would still round up. Therefore this result should not vary on Intel CPUs conforming to the above documentation.
Compare this to computing a double-precision sine directly, using an ideal sin
routine that produces correctly rounded results. The sine of the value is approximately 0.93631021832247413051857150785044253634581268961333520518023697738674775240815140702992025520721336793516756640679315765619707343171517531053811196321335899848286682535203710849065933755262347468763562 (computed with Maple 10). The double closest to this is 0x1.df640d8d36175p-1. That is the same value we obtained by converting the fsin
result to double.
Therefore, the discrepancy is not caused by conversion of long double to double; converting the long double fsin
result to double produces exactly the same result as an ideal double-precision sin
routine.
We do not have a specification for the accuracy of the sin
routine used by the questioner’s Visual Studio package. In commercial libraries, allowing errors of 1 ULP or several ULP is common. Observe how close the sine is to a point where the double-precision value is rounded: It is .498864 ULP (double-precision ULP) away from a double, so it is .001136 ULP away from the point where rounding changes. Therefore, even a very slight inaccuracy in the sin
routine will cause it to return 0x1.df640d8d36174p-1 instead of the closer 0x1.df640d8d36175p-1.
Therefore, I conjecture the source of the discrepancy is a very small inaccuracy in the sin
routine.
(Note: As mentioned in the comments, this does not work on VC2012. I've left it here for general information. I wouldn't recommend relying on anything that depends on the optimization level anyway!)
I don't have VS2012, but on the VS2010 compiler you can specify /fp:fast
on the command line and then I get the same results. This causes the compiler to generate "fast" code that doesn't necessarily quite follow the required rounding and rules in C++ but which matches your assembly language calculation.
I can't try this in VS2012 but I imagine it has the same option.
This only seems to work in an optimized build too with /Ox
as an option.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With