Handling Floating-Point exceptions in C++

I'm finding the floating-point model/error issues quite confusing. It's an area I'm not familiar with and I'm not a low level C/asm programmer, so I would appreciate a bit of advice.

I have a largish C++ application built with VS2012 (VC11) that I have configured to throw floating-point exceptions (or more precisely, to allow the C++ runtime and/or hardware to throw fp-exceptions) - and it is throwing quite a lot of them in the release (optimized) build, but not in the debug build. I assume this is due to the optimizations and perhaps the floating-point model (although the compiler /fp:precise switch is set for both the release and debug builds).

My first question relates to managing the debugging of the app. I want to control where fp-exceptions are thrown and where they are "masked". This is needed because I am debugging the (optimized) release build (which is where the fp-exceptions occur) - and I want to disable fp-exceptions in certain functions where I have detected problems, so I can then locate new FP problems. But I am confused by the difference between using _controlfp_s to do this (which works fine) and the compiler (and #pragma float_control) switch "/fp:except" (which seems to have no effect). What is the difference between these two mechanisms? Are they supposed to have the same effect on fp exceptions?

Secondly, I am getting a number of "Floating-point stack check" exceptions - including one that seems to be thrown in a call to the GDI+ dll. Searching around the web, the few mentions of this exception seem to indicate it is due to compiler bugs. Is this generally the case? If so, how should I work round this? Is it best to disable compiler optimizations for the problem functions, or to disable fp-exceptions just for the problematic areas of code if there don't appear to be any bad floating-point values returned? For example, in the GDI+ call (to GraphicsPath::GetPointCount) that throws this exception, the actual returned integer value seems correct. Currently I'm using _controlfp_s to disable fp-exceptions immediately prior to the GDI+ call – and then use it again to re-enable exceptions directly after the call.

Finally, my application does make a lot of floating-point calculations and needs to be robust and reliable, but not necessarily hugely accurate. The nature of the application is that the floating-point values generally indicate probabilities, so are inherently somewhat imprecise. However, I want to trap any pure logic errors, such as divide by zero. What is the best fp model for this? Currently I am:

  • trapping all fp exceptions (i.e. EM_OVERFLOW | EM_UNDERFLOW | EM_ZERODIVIDE | EM_DENORMAL | EM_INVALID) using _controlfp_s and a SIGFPE Signal handler,
  • have enabled the denormals-are-zero (DAZ) and flush-to-zero (FTZ) (i.e. _MM_SET_FLUSH_ZERO_MODE(_MM_DENORMALS_ZERO_ON)), and
  • I am using the default VC11 compiler settings /fp:precise with /fp:except not specified.

Is this the best model?

Thanks and regards!

Jools99 asked May 29 '13 02:05


2 Answers

Most of the following information comes from Bruce Dawson's blog post on the subject (link).

Since you're working in C++, you can write an RAII class that enables or disables floating-point exceptions for the duration of a scope. That gives you tighter control than calling _controlfp_s() by hand, because the changed exception state is only exposed to the code you choose. Keep in mind that the floating-point control word is per-thread state: a change made with _controlfp_s() affects every function the thread calls afterwards, including third-party code, until it is changed back. It is therefore advisable to save the previous control word and restore it when you are done. RAII takes care of this for you and is a good fit for the GDI+ issue you describe.
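As a rough sketch of the RAII idea: the constructor saves the current floating-point state and masks every exception, and the destructor restores what was saved. The class name ScopedFpExceptionMask is hypothetical and is not taken from the blog post. The MSVC branch uses _controlfp_s and _clearfp; the other branch is an assumption for non-MSVC compilers that swaps in the standard std::feholdexcept and std::fesetenv.

```cpp
#if defined(_MSC_VER)
#include <float.h>   // _controlfp_s, _clearfp, _MCW_EM
#else
#include <cfenv>     // std::feholdexcept, std::fesetenv
#endif

// Hypothetical helper: masks all floating-point exceptions for the
// lifetime of the object and restores the previous state afterwards.
class ScopedFpExceptionMask {
public:
    ScopedFpExceptionMask() {
#if defined(_MSC_VER)
        _controlfp_s(&saved_, 0, 0);               // mask == 0: read only
        unsigned int ignored = 0;
        _controlfp_s(&ignored, _MCW_EM, _MCW_EM);  // set every mask bit
#else
        std::feholdexcept(&saved_);  // save state, clear flags, mask all
#endif
    }

    ~ScopedFpExceptionMask() {
#if defined(_MSC_VER)
        _clearfp();                  // drop flags raised inside the scope
        unsigned int ignored = 0;
        _controlfp_s(&ignored, saved_, _MCW_EM);   // restore the old masks
#else
        std::fesetenv(&saved_);      // restore the saved masks and flags
#endif
    }

private:
    // Non-copyable, written C++03 style because VC11 lacks "= delete".
    ScopedFpExceptionMask(const ScopedFpExceptionMask&);
    ScopedFpExceptionMask& operator=(const ScopedFpExceptionMask&);

#if defined(_MSC_VER)
    unsigned int saved_;
#else
    std::fenv_t saved_;
#endif
};
```

Declaring a local ScopedFpExceptionMask in a block around the GDI+ call would replace the manual pair of _controlfp_s calls, and the previous state comes back even if the block exits early. Clearing the status flags before restoring the masks matters because a flag left set can trigger an exception as soon as it is unmasked.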

The exception flags _EM_OVERFLOW, _EM_ZERODIVIDE, and _EM_INVALID are the most important to account for. _EM_ZERODIVIDE is raised when a finite, non-zero value is divided by zero. _EM_OVERFLOW is raised when a finite result is too large in magnitude to represent and is replaced by positive or negative infinity. _EM_INVALID is raised when an operation has no defined result (for example 0.0/0.0, sqrt(-1.0), or infinity minus infinity), which produces a NaN, or when an operand is a signaling NaN. _EM_UNDERFLOW is usually safe to ignore; it signals that a result is non-zero and lies between -FLT_MIN and FLT_MIN (in other words, that you generated a denormal). _EM_INEXACT is raised too frequently to be of much practical use, since most floating-point operations round, although it can be informative when tracking down imprecise results.
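To make these categories concrete, here is a small sketch that performs one operation and reports which exception flags it raised. It uses the standard C++11 <cfenv> flags (FE_DIVBYZERO, FE_OVERFLOW, FE_INVALID, FE_UNDERFLOW, FE_INEXACT) as stand-ins for the corresponding _EM_* values. That is a portability assumption: <cfenv> may not be available in VC11, where _clearfp() and _statusfp() fill the same role. The function names are hypothetical.

```cpp
#include <cfenv>
#include <cfloat>

// Hypothetical helpers. Each one clears the status flags, performs a single
// operation on volatile operands (so the compiler cannot fold it away at
// compile time), and returns the set of flags that the operation raised.
// Exceptions are assumed to be masked, which is the default.

inline int flagsOfDivide(double num, double den) {
    volatile double a = num, b = den;
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double r = a / b;
    (void)r;
    return std::fetestexcept(FE_ALL_EXCEPT);
}

inline int flagsOfMultiply(double x, double y) {
    volatile double a = x, b = y;
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double r = a * b;
    (void)r;
    return std::fetestexcept(FE_ALL_EXCEPT);
}
```

With these helpers, flagsOfDivide(1.0, 0.0) should report FE_DIVBYZERO, flagsOfDivide(0.0, 0.0) should report FE_INVALID, flagsOfMultiply(DBL_MAX, 2.0) should report FE_OVERFLOW, and flagsOfDivide(DBL_MIN, 3.0) should report FE_UNDERFLOW, assuming flush-to-zero is not enabled.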

SIMD code adds more wrinkles to the mix; since you don't indicate using SIMD explicitly I'll leave out a discussion of that except to note that specifying anything other than /fp:fast can disable automatic vectorization of your code in VS 2012; see this answer for details on this.

masrtis answered Sep 23 '22 20:09


I can't help much with the first two questions, but I have experience and a suggestion for the question about masking FPU exceptions.

I've found the functions

_statusfp()  (x64 and Win32)
_statusfp2() (Win32 only)
_fpreset()
_controlfp_s()
_clearfp()
_matherr()

useful when debugging FPU exceptions and in delivering a stable and fast product.

When debugging, I selectively unmask exceptions to isolate the line of code that generates an FPU exception, particularly in calculations where I cannot avoid calling other code that unpredictably generates FPU exceptions (like the .NET JIT's divides by zero).

In a released product I use them to deliver a stable program that can tolerate serious floating-point exceptions, detect when they occur, and recover gracefully.

I mask all FPU exceptions when I have to call code that cannot be changed, does not have reliable exception handling, and occasionally generates FPU exceptions.

Example:

#define BAD_FPU_EX (_EM_OVERFLOW | _EM_ZERODIVIDE | _EM_INVALID)
#define COMMON_FPU_EX (_EM_INEXACT | _EM_UNDERFLOW | _EM_DENORMAL)
#define ALL_FPU_EX (BAD_FPU_EX | COMMON_FPU_EX)

Release code:

_fpreset();
unsigned int cw = 0;
_controlfp_s(&cw, ALL_FPU_EX, _MCW_EM); // mask ALL_FPU_EX
_clearfp();
... calculation
// The _SW_* status bits returned by _statusfp() share values with the _EM_* bits.
unsigned int bad_fpu_ex = (BAD_FPU_EX & _statusfp());
_clearfp(); // to prevent reacting to existing status flags again
if ( 0 != bad_fpu_ex )
{
  ... use fallback calculation, or
  ... discard result and return error code, or
  ... throw exception with useful information
}
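For comparison, the same check-after-the-fact pattern can be sketched with the standard C++11 <cfenv> functions, with std::feclearexcept standing in for _clearfp() and std::fetestexcept for _statusfp(). This is an analogue under assumptions rather than the code above: it presumes exceptions are already masked (the default), and the function safeRatio, its parameters, and its fallback value are hypothetical.

```cpp
#include <cfenv>

// Hypothetical example: computes num / den and reports failure if the
// division raised a "bad" exception (overflow, divide-by-zero, or invalid).
// Assumes floating-point exceptions are masked, so flags are set silently.
inline bool safeRatio(double num, double den, double fallback, double& out) {
    const int badFlags = FE_OVERFLOW | FE_DIVBYZERO | FE_INVALID;

    std::feclearexcept(FE_ALL_EXCEPT);   // start from a clean slate
    volatile double n = num, d = den;    // keep the division at run time
    volatile double result = n / d;      // ... calculation
    const int raised = std::fetestexcept(badFlags);
    std::feclearexcept(FE_ALL_EXCEPT);   // do not react to these flags again

    if (raised != 0) {
        out = fallback;                  // use the fallback and report failure
        return false;
    }
    out = result;
    return true;
}
```

Returning a bool and writing the value through a reference is only one of the three recovery options listed above; throwing an exception with useful information would fit the same structure.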

Debug code:

_fpreset();
_clearfp();
unsigned int cw = 0;
_controlfp_s(&cw, COMMON_FPU_EX, _MCW_EM); // mask COMMON_FPU_EX, unmask BAD_FPU_EX
... calculation
  "crash" in debugger on the line of code that is generating the "bad" exception.

Depending on your compiler options, release builds may be using intrinsic calls to FPU ops and debug builds may call math library functions. These two methods can have significantly different error handling behavior for invalid operations like sqrt(-1.0).
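One way to see what a given build does with sqrt(-1.0) is to record the result together with errno and the invalid-operation flag. The sketch below uses the standard <cmath>, <cerrno>, and <cfenv> headers; the struct and function names are hypothetical. Which of errno and FE_INVALID gets set depends on the compiler, its options, and the math library, so treat the output as something to observe per build rather than a guarantee.

```cpp
#include <cerrno>
#include <cfenv>
#include <cmath>

// Hypothetical report of how one sqrt() call behaved in this build.
struct SqrtReport {
    double value;        // the returned value
    bool isNan;          // true when the result is a NaN
    bool raisedInvalid;  // true when FE_INVALID was raised
    bool setErrno;       // true when errno was set to EDOM
};

inline SqrtReport probeSqrt(double x) {
    volatile double input = x;          // prevent compile-time evaluation
    errno = 0;
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double result = std::sqrt(input);

    SqrtReport report;
    report.setErrno = (errno == EDOM);  // read errno before any other call
    report.raisedInvalid = std::fetestexcept(FE_INVALID) != 0;
    report.value = result;
    report.isNan = (report.value != report.value);  // NaN never equals itself
    return report;
}
```

Running probeSqrt(-1.0) in both the debug and release configurations and comparing the two reports is a quick way to check whether the intrinsic and the library call disagree in your build.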

Using executables built with VS2010 on 64-bit Windows 7, I have seen slightly different double-precision results from identical code on the Win32 and x64 platforms. This happened even with non-optimized debug builds using /fp:precise, with the FPU precision control explicitly set to _PC_53 and the FPU rounding control explicitly set to _RC_NEAR. I had to adjust some regression tests that compare double-precision values to take the platform into account. I don't know if this is still an issue with VS2012, but heads up.
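A common way to make such regression tests tolerant of platform differences is to compare with a relative tolerance instead of exact equality. The helper below is a minimal sketch of that idea and not necessarily the adjustment described above; the name nearlyEqual and the default tolerances are assumptions to tune for your own data.

```cpp
#include <algorithm>
#include <cfloat>
#include <cmath>

// Hypothetical helper: true when a and b agree to within relTol relative to
// the larger magnitude, or to within absTol when both are close to zero.
inline bool nearlyEqual(double a, double b,
                        double relTol = 1e-12, double absTol = 1e-15) {
    if (a == b) {
        return true;                     // also covers equal infinities
    }
    if (std::fabs(a) > DBL_MAX || std::fabs(b) > DBL_MAX) {
        return false;                    // an infinity only matches itself
    }
    const double diff = std::fabs(a - b);
    if (diff <= absTol) {
        return true;                     // both values are near zero
    }
    const double scale = (std::max)(std::fabs(a), std::fabs(b));
    return diff <= relTol * scale;       // false if either value is a NaN
}
```

The absolute tolerance is there because a purely relative test becomes too strict as the values approach zero, which is relevant when the values are small probabilities.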

Dale Lear answered Sep 24 '22 20:09
