
On a float rounding error

I do not understand the output of the following program:

#include <iostream>

int main()
{
    float  x     = 14.567729f;
    float  sqr   = x * x;
    float  diff1 = sqr - x * x;
    double diff2 = double(sqr) - double(x) * double(x);
    std::cout << diff1 << std::endl;
    std::cout << diff2 << std::endl;
    return 0;
}

Output:

6.63225e-006
6.63225e-006

I use VS2010, x86 compiler.

I expected a different output:

0
6.63225e-006

Why is diff1 not equal to 0? It looks as if the compiler promotes the floats to double precision to evaluate sqr - x * x. Why?

Alexey Malistov asked Jun 22 '11 13:06


1 Answer

The floating-point registers are 80 bits wide (on x86 CPUs, when the compiler uses the x87 FPU).

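While an expression is being evaluated, intermediate results are held at that extended width. A value is only rounded to 32 bits (float) or 64 bits (double) when it is stored to a location in memory. Here sqr has already been stored as a float, but the second x * x in sqr - x * x is still held at the wider precision, so the difference is not 0. If everything is held in registers (try compiling with -O3), you may see a different result.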

Compiled with -O3:

> ./a.out
0
6.63225e-06
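
The effect can be reproduced on purpose, whatever the compiler decides to do with registers. This is only a minimal sketch: the helper names diff_rounded and diff_wide are made up for illustration, and it assumes that writing to a volatile float is enough to force the value to be rounded to 32 bits, and that double is an acceptable stand-in for the wider register.

```cpp
// Hypothetical helper: both products go through a float object in memory,
// so each one is rounded to 32 bits before the subtraction.
float diff_rounded(float x)
{
    volatile float sqr  = x * x;  // rounded to float on store
    volatile float prod = x * x;  // rounded the same way
    return sqr - prod;
}

// Hypothetical helper: sqr is rounded to float, but the second product is
// kept at double precision, imitating an intermediate held in a wide register.
double diff_wide(float x)
{
    volatile float sqr = x * x;
    return static_cast<double>(sqr)
         - static_cast<double>(x) * static_cast<double>(x);
}
```

For x = 14.567729f, diff_rounded should return 0, while diff_wide returns the 6.63225e-06 rounding error shown in the outputs above. The FLT_EVAL_METHOD macro from <cfloat> reports which precision the implementation uses for intermediates, if you want to check your own compiler.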

Martin York answered Sep 29 '22 07:09