
Is there any way to make sure floating point arithmetic gives the same result on both Linux and Windows?

My program runs on both Linux and Windows, and I have to make sure the floating point arithmetic gives the same result on both operating systems.

Here is the code:

for (int i = 0; i < 100000; ++i)
{
    float d_value = 10.0f / float(i);
    float p_value = 0.01f * float(i) + 100.0f;
}

I build the code on Linux with "g++ -m32 -c -static -g -O0 -ffloat-store". On Windows I build it with VS2005 using "/fp:precise /O2".

When I printf "d_value" and "p_value", "d_value" is always the same on Linux and Windows, but "p_value" sometimes differs. For example, printing "p_value" in hexadecimal format:

windows:  42d5d1eb
linux:    42d5d1ec

Why does this happen?

My g++ version is

Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-8' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.4 --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-targets=all --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.4.5 (Debian 4.4.5-8)

I use the -ffloat-store flag because of a suggestion made here: Different math rounding behaviour between Linux, Mac OS X and Windows

hdbean asked May 06 '13 09:05


2 Answers

Use /fp:strict on Windows to tell the compiler to produce code that strictly follows IEEE 754, and gcc -msse2 -mfpmath=sse on Linux to obtain the same behavior there.

The reasons for the differences you are seeing have been discussed in several places on Stack Overflow, but the best survey is David Monniaux's article.


The assembly instructions I obtain when compiling with gcc -msse2 -mfpmath=sse are as follows. Instructions cvtsi2ssq, divss, mulss, addss are the correct instructions to use, and they result in a program where p_value contains at one point 42d5d1ec.

    .globl  _main
    .align  4, 0x90
_main:                                  ## @main
    .cfi_startproc
## BB#0:
    pushq   %rbp
Ltmp2:
    .cfi_def_cfa_offset 16
Ltmp3:
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
Ltmp4:
    .cfi_def_cfa_register %rbp
    subq    $32, %rsp
    movl    $0, -4(%rbp)
    movl    $0, -8(%rbp)
LBB0_1:                                 ## =>This Inner Loop Header: Depth=1
    cmpl    $100000, -8(%rbp)       ## imm = 0x186A0
    jge LBB0_4
## BB#2:                                ##   in Loop: Header=BB0_1 Depth=1
    movq    _p_value@GOTPCREL(%rip), %rax
    movabsq $100, %rcx
    cvtsi2ssq   %rcx, %xmm0
    movss   LCPI0_0(%rip), %xmm1
    movabsq $10, %rcx
    cvtsi2ssq   %rcx, %xmm2
    cvtsi2ss    -8(%rbp), %xmm3
    divss   %xmm3, %xmm2
    movss   %xmm2, -12(%rbp)
    cvtsi2ss    -8(%rbp), %xmm2
    mulss   %xmm2, %xmm1
    addss   %xmm0, %xmm1
    movss   %xmm1, (%rax)
    movl    (%rax), %edx
    movl    %edx, -16(%rbp)
    leaq    L_.str(%rip), %rdi
    movl    -16(%rbp), %esi
    movb    $0, %al
    callq   _printf
    movl    %eax, -20(%rbp)         ## 4-byte Spill
## BB#3:                                ##   in Loop: Header=BB0_1 Depth=1
    movl    -8(%rbp), %eax
    addl    $1, %eax
    movl    %eax, -8(%rbp)
    jmp LBB0_1
LBB0_4:
    movl    -4(%rbp), %eax
    addq    $32, %rsp
    popq    %rbp
    ret
Pascal Cuoq answered Nov 14 '22 21:11


The precise results of your code are not fully defined by the IEEE and C/C++ standards. That is the source of the problem.

The main problem is that while all of your inputs are floats, that does not mean the calculation must be done at float precision. The compiler can decide to use double precision for all intermediate values if it wants to. This tends to happen automatically when compiling for x87 FPUs, but the compiler (VC++ 2010, for instance) can do this expansion explicitly if it wants to, even when compiling SSE code.

This is not well understood. I shared my understanding of this a few years ago here:

http://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/

Some compilers let you specify the intermediate precision. If you can force all compilers to use the same intermediate precision then your results should be consistent.

Bruce Dawson answered Nov 14 '22 21:11