Bizarre floating-point behavior with vs. without extra variables, why?

When I run the following code in VC++ 2013 (32-bit, no optimizations):

#include <cmath>
#include <iostream>
#include <limits>

double mulpow10(double const value, int const pow10)
{
    static double const table[] =
    {
        1E+000, 1E+001, 1E+002, 1E+003, 1E+004, 1E+005, 1E+006, 1E+007,
        1E+008, 1E+009, 1E+010, 1E+011, 1E+012, 1E+013, 1E+014, 1E+015,
        1E+016, 1E+017, 1E+018, 1E+019,
    };
    return pow10 < 0 ? value / table[-pow10] : value * table[+pow10];
}

int main(void)
{
    double d = 9710908999.008999;
    int j_max = std::numeric_limits<double>::max_digits10;
    while (j_max > 0 && (
        static_cast<double>(
            static_cast<unsigned long long>(
                mulpow10(d, j_max))) != mulpow10(d, j_max)))
    {
        --j_max;
    }
    double x = std::floor(d * 1.0E9);
    unsigned long long y1 = x;
    unsigned long long y2 = std::floor(d * 1.0E9);
    std::cout
        << "x == " << x << std::endl
        << "y1 == " << y1 << std::endl
        << "y2 == " << y2 << std::endl;
}

I get

x  == 9.7109089990089994e+018
y1 == 9710908999008999424
y2 == 9223372036854775808

in the debugger.

I'm mindblown. Can someone please explain to me how the heck y1 and y2 have different values?

Update:

This only seems to happen under /arch:SSE2 or /arch:AVX, not under /arch:IA32 or /arch:SSE.
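For anyone trying to reproduce this, here is a reduced sketch of the comparison. It assumes the `j_max` loop affects only code generation and not the value of `x`. The helper names (`fits_in_ull`, `exceeds_signed_ll`, `via_variable`, `direct`, `conversions_agree`) are hypothetical and not part of the original program. Two facts make the mismatch surprising: `x` is below 2^64, so converting it to `unsigned long long` is well-defined and both paths should give the same result, and the `y2` value shown above, 9223372036854775808, is exactly 2^63. Whether the compiler is using a signed 64-bit conversion for `y2` is only a guess and is not verified here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: true when v can be converted to unsigned long long
// with a well-defined result, i.e. 0 <= v < 2^64.
bool fits_in_ull(double v)
{
    return v >= 0.0 && v < std::ldexp(1.0, 64);  // ldexp(1.0, 64) is exactly 2^64
}

// Hypothetical helper: true when v is too large for a signed long long,
// i.e. v >= 2^63.
bool exceeds_signed_ll(double v)
{
    return v >= std::ldexp(1.0, 63);
}

// Convert through a named double, the way y1 is computed.
unsigned long long via_variable(double d)
{
    double x = std::floor(d * 1.0E9);
    unsigned long long y1 = x;
    return y1;
}

// Convert the expression directly, the way y2 is computed.
unsigned long long direct(double d)
{
    unsigned long long y2 = std::floor(d * 1.0E9);
    return y2;
}

// True when both conversion paths produce the same integer.
bool conversions_agree(double d)
{
    // The comparison is only meaningful for in-range values.
    assert(fits_in_ull(std::floor(d * 1.0E9)));
    return via_variable(d) == direct(d);
}
```

Calling `conversions_agree(9710908999.008999)` from `main` should return `true` on a conforming compiler. With the results shown above it would return `false`, since `y1` and `y2` differ.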
