
Different intrinsics behaviour depending on GCC version

I'm pretty new to intrinsics, and I'm seeing different behavior from my code with GCC 7.4 and GCC 8.3.

My code is pretty simple:

b.cpp:

#include <iostream>
#include <xmmintrin.h>

void foo(const float num, const float denom)
{
    const __v4sf num4 = {
        num,
        num,
        num,
        num,
    };
    const __v4sf denom4 = {
        denom,
        denom,
        denom,
        denom,
    };
    float res_arr[] = {0, 0, 0, 0};

    __v4sf *res = (__v4sf*)res_arr;
    *res = num4 / denom4;
    std::cout << res_arr[0] << std::endl;
    std::cout << res_arr[1] << std::endl;
    std::cout << res_arr[2] << std::endl;
    std::cout << res_arr[3] << std::endl;
}

In b.cpp we basically just construct two __v4sf vectors from float variables and perform a division.

b.h:

#ifndef B_H
#define B_H

void foo(const float num, const float denom);

#endif

a.cpp:

#include "b.h"

int main (void)
{
    const float denominator = 1.0f;
    const float numerator = 12.0f;
    foo(numerator, denominator);
    return 0;
}

Here we just call our function from b.cpp

GCC 7.4 works ok:

g++-7 -c b.cpp -o b.o && g++-7 a.cpp b.o -o a.out && ./a.out
12
12
12
12

But something is wrong with GCC 8.3:

g++-8 -c b.cpp -o b.o && g++-8 a.cpp b.o -o a.out && ./a.out
inf
inf
inf
inf

So my question is: why do I get different results with different versions of GCC? Is it undefined behavior?

Daiver asked Jun 10 '19 08:06




1 Answer

You've found a bug in gcc8 and later, which happens with/without optimization enabled. Thanks for reporting it.

With optimization enabled it's easy to see what the asm is doing because the __v4sf stuff optimizes away: it's just scalar division and printing the result 4 times. (Plus 4 calls to flush cout because you used std::endl for some reason.)

gcc7 correctly optimizes it to divss xmm0, xmm1 to do num / denom. It then converts the result to double, because the output functions only take double, not float, and passes that to the iostream functions. (GCC7 saves the double bit-pattern in integer register r14 instead of memory, with -mtune=skylake. GCC8 and later just use memory, which probably makes more sense.)

gcc8 and later does divss xmm0, .LC0[rip] where the constant from memory is 0 (the bit-pattern for +0.0). So it's dividing the num by zero, ignoring denom.

Check it out on the Godbolt compiler explorer.

Using alignas(16) float res_arr[4]; to remove the potential under-alignment of the __v4sf *res doesn't help. (You generally don't need __attribute__((aligned(16))) anymore; C++11 introduced standard syntax for alignment.)


Peter Cordes answered Nov 01 '22 03:11