
Floating point C++ compiler options | preventing a/b -> a*(1/b)

I'm writing real-time numeric software in C++, currently compiled with Visual C++ 2008. I'm using the 'fast' floating-point model (/fp:fast), which enables various optimizations. Most of them are useful in my case, but one specifically:

a/b -> a*(1/b) Division by multiplicative inverse

is too numerically unstable for a lot of my calculations.

(see: Microsoft Visual C++ Floating-Point Optimization)

Switching to /fp:precise makes my application run more than twice as slow. Is it possible to either fine-tune the optimizer (i.e. disable this specific optimization), or somehow manually bypass it?

Actual minimal code example:

void test(float a, float b, float c,
    float &ret0, float &ret1) {
  ret0 = b/a;
  ret1 = c/a;
} 

[My actual code is mostly matrix-related algorithms.]

Output from VC (cl, version 15, x86) is:

divss       xmm0,xmm1 
mulss       xmm2,xmm0 
mulss       xmm1,xmm0 

Having one div instead of two is a big problem numerically (xmm0 is preloaded with 1.0f from RAM): depending on the values in xmm1 and xmm2 (which may be in different ranges), you might lose a lot of precision. (Compiling without SSE outputs similar stack-based x87 FPU code.)
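To make the concern concrete, here is a minimal sketch comparing a direct division with the reciprocal form that the optimization produces. The function names are hypothetical and not part of the original code. It assumes an IEEE 754 target with plain float evaluation and no fast-math style flags; under those assumptions 5.0f/3.0f and 5.0f*(1.0f/3.0f) should differ in the last bit, because the reciprocal form rounds twice.

```cpp
#include <cassert>
#include <cstdio>

// Direct division: a single correctly rounded operation.
float divide_direct(float num, float den) {
    return num / den;
}

// Reciprocal form, mimicking the a/b -> a*(1/b) rewrite: the reciprocal is
// rounded to float first, then the product is rounded a second time.
float divide_via_reciprocal(float num, float den) {
    volatile float inv = 1.0f / den;  // volatile: force the rounded float to be stored
    return num * inv;
}

// Counts integer pairs (num, den) in [1, limit] x [1, limit] for which the
// two forms give different float results.
int count_mismatches(int limit) {
    assert(limit > 0);
    int mismatches = 0;
    for (int den = 1; den <= limit; ++den) {
        for (int num = 1; num <= limit; ++num) {
            const float fn = static_cast<float>(num);
            const float fd = static_cast<float>(den);
            if (divide_direct(fn, fd) != divide_via_reciprocal(fn, fd)) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}

// Hypothetical reporting helper.
void report_mismatches(int limit) {
    std::printf("%d of %d pairs differ\n", count_mismatches(limit), limit * limit);
}
```

Calling report_mismatches(16) prints how many of the 256 small integer pairs disagree. Divisors that are powers of two never disagree, since their reciprocals are exact.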

Wrapping the function with

#pragma float_control( precise, on, push )
...
#pragma float_control(pop)

does solve the accuracy problem, but first, it's only available at function level (global scope), and second, it prevents inlining of the function (i.e. the speed penalty is too high).

The 'precise' output is also being cast to 'double' and back:

 divsd       xmm1,xmm2 
 cvtsd2ss    xmm1,xmm1 
 divsd       xmm1,xmm0 
 cvtpd2ps    xmm0,xmm1 
oyd11 asked Aug 04 '10 16:08



2 Answers

Add

#pragma float_control( precise, on)

before the computation and

#pragma float_control( precise,off)

after that. I think that should do it.
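Applied to the function from the question, the suggestion might look like the sketch below. The pragma is placed at file scope around the function, since the question reports that it only works at that level. The _MSC_VER guard and the name divide_pair are assumptions added for illustration so that other compilers skip the MSVC-specific pragma.

```cpp
// Assumption: float_control is an MSVC pragma, so it is guarded here.
#if defined(_MSC_VER)
#pragma float_control(precise, on)
#endif

// Hypothetical name for the question's test function: both divisions are
// meant to be compiled as real divisions while 'precise' is on.
void divide_pair(float a, float b, float c, float &ret0, float &ret1) {
    ret0 = b / a;
    ret1 = c / a;
}

#if defined(_MSC_VER)
#pragma float_control(precise, off)
#endif
```

Note that the closing pragma switches 'precise' off unconditionally, whatever the setting was before. The push/pop form shown in the question restores the previous state instead.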

Gangadhar answered Oct 05 '22 20:10



The linked document (Microsoft Visual C++ Floating-Point Optimization) states that you can control the floating-point optimisations on a line-by-line basis using pragmas.

Oliver Charlesworth answered Oct 05 '22 22:10
