
gcc auto vectorization control flow in loop

In the code below, why can gcc auto-vectorize the second loop but not the first? How can I modify the code so that the first loop auto-vectorizes too? gcc says:

note: not vectorized: control flow in loop.

I am using gcc 8.2 with the flags -O3 -fopt-info-vec-all, compiling for x86-64 with AVX2.

#include <stdlib.h>
#include <math.h>

void foo(const float * x, const float * y, const int * v, float * vec, float * novec, size_t size) {
    size_t i;
    float bar;
    for (i=0 ; i<size ; ++i){
        bar = x[i] - y[i];
        novec[i] = v[i] ? bar : NAN;
    }
    for (i=0 ; i<size ; ++i){
        bar = x[i];
        vec[i] = v[i] ? bar : NAN;
    }
}

Update: This does autovectorize:

for (i=0 ; i<size ; ++i){
    bar = x[i];
    novec[i] = v[i] ? bar : NAN;
    novec[i] -= y[i];
}

I would still like to know why gcc reports control flow in the first loop.

user2133814 asked Nov 08 '18 14:11



1 Answer

clang auto-vectorizes even the first loop, but gcc 8.2 doesn't (https://godbolt.org/z/cnlwuO).

gcc vectorizes with -ffast-math. Perhaps it's worried about preserving FP exception flag status from the subtraction?

-fno-trapping-math alone is sufficient for gcc to auto-vectorize (without the rest of what -ffast-math sets), so apparently it is worried about FP exceptions (https://godbolt.org/z/804ykV). I think it's being over-cautious, because the C source computes bar on every iteration, whether or not the result is used.
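To try the flag comparison described above locally, here is a minimal sketch that isolates the question's first loop in its own function. The function name masked_diff and the file name masked_diff.c are made up for illustration; the flags are the ones from the question and this answer, and the outcome was only checked with gcc 8.2, so other versions may behave differently.

```c
#include <math.h>
#include <stddef.h>

/* Same computation as the first loop in the question, isolated in one
 * function. The name masked_diff is hypothetical.
 *
 * Compile both ways and compare the vectorizer notes:
 *   gcc -O3 -mavx2 -fopt-info-vec-all -c masked_diff.c
 *   gcc -O3 -mavx2 -fno-trapping-math -fopt-info-vec-all -c masked_diff.c
 */
void masked_diff(const float *x, const float *y, const int *v,
                 float *out, size_t size)
{
    for (size_t i = 0; i < size; ++i) {
        float bar = x[i] - y[i];     /* subtraction runs on every iteration */
        out[i] = v[i] ? bar : NAN;   /* keep the result only where v[i] != 0 */
    }
}
```

The flag changes only what the optimizer is allowed to assume about FP exceptions; the values stored to out are the same either way.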

gcc will auto-vectorize simple FP a[i] = b[i]+c[i] loops without any FP math options.
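For comparison, a sketch of the kind of simple loop meant here, with no conditional select on the result. The name add_arrays is hypothetical; compiling it with the question's flags (-O3 -fopt-info-vec-all) should show what the vectorizer reports for it on your gcc version.

```c
#include <stddef.h>

/* Plain element-wise FP add: every result is stored, nothing is selected.
 * The name add_arrays is made up for this sketch. */
void add_arrays(float *a, const float *b, const float *c, size_t size)
{
    for (size_t i = 0; i < size; ++i)
        a[i] = b[i] + c[i];
}
```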

Peter Cordes answered Sep 18 '22 10:09
