I found a mismatch in the results of some complex calculations between my 32-bit and 64-bit builds. When I examined the intermediate results closely, it turned out that the std::pow function produces the mismatch. Below are the inputs and outputs.
long double dvalue = 2.7182818284589998;
long double dexp = -0.21074699576017999;
long double result = std::powl( dvalue, dexp);
64-bit build -> result = 0.80997896907296496, 32-bit build -> result = 0.80997896907296507
I am using VS2008. I have also tried the other variant of the pow function, which takes and returns long double, but I still see the same difference.
double pow( double base, double exponent );
long double powl( long double base, long double exponent );
I have read some info on this:
Intel x86 processors use 80-bit extended precision internally, whereas double is normally 64-bit wide. Different optimization levels affect how often floating point values from CPU get saved into memory and thus rounded from 80-bit precision to 64-bit precision. Alternatively, use the long double type, which is normally 80-bit wide on gcc to avoid rounding from 80-bit to 64-bit precision.
Could someone clearly explain the difference and suggest ways to overcome it?
The correct answer (rounded to nearest) is 0.80997896907296507151841069571673870086669921875, and that is exactly what you got in the "32bit result", printed as 0.80997896907296507. Your "64bit result" appears to be exactly the adjacent 64-bit double value, rounded the wrong way from the correct result (and printed as 0.80997896907296496).
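To check the claim that the two printed results are neighbouring doubles, here is a minimal sketch. It assumes a C++11 compiler for std::nextafter (VS2008 predates that; its <float.h> offers _nextafter instead), and the helper names is_next_double_up and print_question_values are made up for illustration.

```cpp
#include <cmath>
#include <cstdio>
#include <limits>

// Returns true when `hi` is the very next representable double above `lo`.
// (Hypothetical helper, not part of any library.)
bool is_next_double_up(double lo, double hi) {
    return std::nextafter(lo, std::numeric_limits<double>::infinity()) == hi;
}

// Prints the two results from the question with 17 significant digits,
// which is enough to tell any two distinct doubles apart.
void print_question_values() {
    const double r64 = 0.80997896907296496;  // value reported by the 64-bit build
    const double r32 = 0.80997896907296507;  // value reported by the 32-bit build
    std::printf("64-bit: %.17g\n32-bit: %.17g\n", r64, r32);
    std::printf("adjacent doubles: %s\n", is_next_double_up(r64, r32) ? "yes" : "no");
}
```

Calling print_question_values() from main should show that the two builds differ by a single unit in the last place, which is the smallest difference two doubles can have.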
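Note that with Visual C++, long double has the same 64-bit representation as double on both 32-bit and 64-bit targets, so calling powl instead of pow does not add precision. The results differ because the two builds carry out the intermediate calculation differently, not because the type has a different width.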
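Whether long double is actually wider than double depends on the compiler and target, so it is worth checking rather than assuming. The following is a small sketch using only standard headers; the function names are hypothetical, and FLT_EVAL_METHOD is a C99/C++11 macro that older compilers such as VS2008 may not define, hence the guard.

```cpp
#include <cfloat>
#include <cstdio>
#include <limits>

// True when long double carries more significand bits than double
// on the compiler and target this is built with.
bool long_double_is_wider() {
    return std::numeric_limits<long double>::digits >
           std::numeric_limits<double>::digits;
}

// Prints the significand widths so 32-bit and 64-bit builds can be compared.
void print_float_formats() {
    std::printf("double significand bits:      %d\n",
                std::numeric_limits<double>::digits);
    std::printf("long double significand bits: %d\n",
                std::numeric_limits<long double>::digits);
#ifdef FLT_EVAL_METHOD
    // 0: evaluate in the operand type, 2: evaluate in long double, -1: unknown
    std::printf("FLT_EVAL_METHOD:              %d\n",
                static_cast<int>(FLT_EVAL_METHOD));
#endif
}
```

If both builds print the same width for double and long double, switching between pow and powl cannot be expected to change the result.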
What's probably happening is that the 32-bit build is using the 80-bit FPU registers to do the calculation and the 64-bit build is using the SIMD operations using 64-bit values, causing a slight discrepancy. Note that both answers agree to 14 decimal places, which is about the best you can hope for with 64-bit floating point values.
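Since a difference in the last digit or two is expected between such builds, one way to live with it is to compare results with a tolerance instead of expecting bit-identical output. This is a minimal sketch, not a universal recipe: the name nearly_equal and the default of 4 units of relative epsilon are assumptions chosen for illustration, and a tolerance that suits your calculation may be different.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Returns true when a and b differ by at most `max_eps` times the machine
// epsilon, scaled by the larger magnitude of the two values.
// (Hypothetical helper; the default tolerance is an assumption.)
bool nearly_equal(double a, double b, double max_eps = 4.0) {
    assert(max_eps >= 0.0);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <=
           max_eps * std::numeric_limits<double>::epsilon() * scale;
}
```

With this helper, nearly_equal(0.80997896907296496, 0.80997896907296507) is true, while values that differ in the 13th decimal place are still reported as different.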
Visual C++ offers compiler options that let you say whether you prefer speed, consistency, or precision with regard to floating point operations. Using those options (e.g., /fp:strict), you can probably get consistent values between the two builds if that's important to you.
Also note that VC++2008 is rather old. Newer versions have fixes for many bugs, including some related to floating point. (Popular implementations of strtod in open source software have had bugs detected and fixed since 2008.) In addition to the precision difference between 80-bit and 64-bit operations, you may also be encountering parsing and display bugs. Nonetheless, floating point is hard, and bugs persist.