Look at this program:
#include <iostream>
#include <cmath>
using namespace std;
typedef pair<int, int> coords;
double dist(coords a, coords b)
{
return sqrt((a.first - b.first) * (a.first - b.first) +
(a.second - b.second) * (a.second - b.second));
}
int main()
{
coords A = make_pair(1, 0);
coords B = make_pair(0, 1);
coords C = make_pair(-1, 0);
coords D = make_pair(0, -1);
cerr.precision(20);
cerr << dist(A, B) + dist(C, D) << endl;
cerr << dist(A, D) + dist(B, C) << endl;
if(dist(A, B) + dist(C, D) > dist(A, D) + dist(B, C))
{
cerr << "*" << endl;
}
return 0;
}
Function dist returns distance betweeen two points. A, B, C, D are corners of square.
It should be dist(A, B) == dist(B, C) == dist(C, D) == dist(D, A) == sqrt(2).
And dist(A, B) + dist(C, D) == dist(A, D) + dist(B, C) == 2 * sqrt(2)
I am using GNU/Linux, i586, GCC 4.8.2.
I compile this program and run:
$ g++ test.cpp ; ./a.out
2.8284271247461902909
2.8284271247461902909
*
We see, that program outputs equal values of dist(A, B) + dist(C, D) and dist(A, D) + dist(B, C), but condition dist(A, B) + dist(C, D) > dist(A, D) + dist(B, C) is true!
When I compile with -O2, its look OK:
$ g++ test.cpp -O2 ; ./a.out
2.8284271247461902909
2.8284271247461902909
I think, it is a gcc-bug, and it is not directly related to floating-point operation accuracy, because in this case values dist(A, B) + dist(C, D) and dist(A, D) + dist(B, C) MUST BE equal (each of them is sqrt(2) + sqrt(2)).
When I change function dist:
double dist(coords a, coords b)
{
double x = sqrt((a.first - b.first) * (a.first - b.first) + (a.second - b.second) * (a.second - b.second));
return x;
}
the program runs correct. So the problem not in floating-point representation of numbers, but in the gcc code.
Simplified example for 32-bit compiler:
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
if (sqrt(2) != sqrt(2))
{
cout << "Unequal" << endl;
}
else
{
cout << "Equal" << endl;
}
return 0;
}
Run without optimization:
$ g++ test.cpp ; ./a.out
Unequal
Run with -ffloat-store:
$ g++ test.cpp -ffloat-store ; ./a.out
Equal
Probably, it is "not a bug" in GCC #323: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
Compiling with -ffloat-store solve the problem.
This seemingly strange behavior is due to the way the old x87 floating point unit works: it uses an 80-bit long double
type with 64 bits of precision as its register format, while temporary double
s are 64 bits long with 53 bits of precision. What happens is that the compiler spills 1 of the sqrt(2)
results to memory (as sqrt
returns a double
, this rounds to the 53-bit significand of that type) so that the FP register-stack is clear for the next call to sqrt(2)
. It then compares that loaded-from-memory 53-bits-of-precision value to the unrounded 64-bits-of-precision value coming back from the other sqrt(2)
call, and they come up different because they are rounded differently, as you can see from this assembler output (annotations mine, used your second code snippet with the 2s changed to 2.0s for clarity and -Wall -O0 -m32 -mfpmath=387 -march=i586 -fno-builtin
for compile flags on Godbolt):
main:
# Prologue
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $20, %esp
# Argument push (2.0)
subl $8, %esp
movl $0, %eax
movl $1073741824, %edx
pushl %edx
pushl %eax
# sqrt(2.0)
call sqrt
# Return value spill
addl $16, %esp
fstpl -16(%ebp)
# Argument push (2.0)
subl $8, %esp
movl $0, %eax
movl $1073741824, %edx
pushl %edx
pushl %eax
# sqrt(2.0)
call sqrt
addl $16, %esp
# Comparison -- see how one arg is loaded from a spill slot while the other is
# coming from the ST(0) i387 register
fldl -16(%ebp)
fucompp
fnstsw %ax
# Status word interpretation
andb $69, %ah
xorb $64, %ah
setne %al
testb %al, %al
# The branch was here, but it and the code below was snipped for brevity's sake
Verdict: the x87 is the weirdo. -mfpmath=sse
is the definitive fix for this behavior -- it'll make GCC define FLT_EVAL_METHOD
to 0, as the SSE(2) floating point support is single/double only. The-ffloat-store
switch works around it as well for this program, but it's not recommended as a general purpose workaround -- it'll make your program slower due to the extra spills/fills, and doesn't work in all cases. Of course, going to a 64bit CPU/OS/compiler combination fixes this too, because the x86-64 ABI uses SSE2 for floating-point math by default.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With