
How to avoid floating point exceptions in unused SIMD lanes

I like to run my code with floating point exceptions enabled. I do this under Linux using:

feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );

So far so good.

The issue I am having is that sometimes the compiler (I use clang 8) decides to use SIMD instructions to do a scalar division. Fine, if that is faster, even for a single scalar, why not.

But the result is that an unused lane in the SIMD register can contain a zero.

And when the SIMD division is executed, a floating point exception is thrown.

Does that mean that floating point exceptions cannot be used at all if you allow the compiler to use SSE/AVX extensions?

In my case, this line of C code:

float a0, min, a, d;
...
a0 = (min - a) / (d);

...is executed as:

divps  %xmm2,%xmm3

Which then throws a:

Thread 1 "noisetuner" received signal SIGFPE, Arithmetic exception.
Bram asked Jul 28 '20 01:07




1 Answer

I think you have found a bug in clang, or maybe in LLVM.

Here’s how I reproduced it; clang 10.0 emits the same code, i.e. it has that bug as well. Clearly, that vdivps instruction only has valid data in the first 2 lanes of the vectors, and in the upper 2 lanes it computes 0.0 / 0.0, so you’ll get a runtime exception if you unmask these exceptions in the MXCSR register, as you’re doing.
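To make the effect of the padding lanes concrete, here is a minimal sketch in C with SSE intrinsics. It assumes an x86 target, and the helper names (`load4`, `packed_div_flags`, `flags_with_zero_padding`, `flags_with_one_padding`) are made up for illustration. It does not unmask any exceptions; it only reads the sticky flags in MXCSR after a packed division, so nothing traps.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_div_ps, _mm_getcsr, ... */

/* Hypothetical helper: copy four volatile floats into an SSE register.
   The volatile reads keep the compiler from folding the division away. */
__m128 load4(const volatile float *p)
{
    float tmp[4] = { p[0], p[1], p[2], p[3] };
    return _mm_loadu_ps(tmp);
}

/* Hypothetical helper: divide all four lanes and return the sticky MXCSR
   exception flags raised by the division. Exceptions stay masked here. */
unsigned int packed_div_flags(const volatile float *num,
                              const volatile float *den)
{
    volatile float sink[4];
    float out[4];
    int i;

    /* Clear any exception flags left over from earlier code. */
    _mm_setcsr(_mm_getcsr() & ~(unsigned int)_MM_EXCEPT_MASK);

    _mm_storeu_ps(out, _mm_div_ps(load4(num), load4(den)));
    for (i = 0; i < 4; ++i)
        sink[i] = out[i];   /* keep the result alive */

    return _mm_getcsr() & _MM_EXCEPT_MASK;
}

/* Two useful lanes; the upper two lanes of both operands are zero. */
unsigned int flags_with_zero_padding(void)
{
    volatile float num[4] = { 1.0f, 3.0f, 0.0f, 0.0f };
    volatile float den[4] = { 4.0f, 2.0f, 0.0f, 0.0f };
    return packed_div_flags(num, den);
}

/* Same useful lanes, but the unused divisor lanes hold 1.0 instead. */
unsigned int flags_with_one_padding(void)
{
    volatile float num[4] = { 1.0f, 3.0f, 0.0f, 0.0f };
    volatile float den[4] = { 4.0f, 2.0f, 1.0f, 1.0f };
    return packed_div_flags(num, den);
}
```

In the first case the `_MM_EXCEPT_INVALID` bit comes back set purely because of the 0.0 / 0.0 in the unused lanes; with FE_INVALID unmasked, that same division would deliver SIGFPE. In the second case the flag stays clear.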

Microsoft, Intel and gcc don’t emit divps for that code. If you can, switch to gcc and it should be good.

Update: Clang 10+ has an option controlling such optimizations, -ffp-exception-behavior=maytrap, take a look: https://godbolt.org/z/WG7bEE

Soonts answered Oct 17 '22 21:10