gcc auto vectorization control flow in loop

Question

In the code below, why can gcc auto-vectorize the second loop but not the first? How can I modify the code so that the first loop auto-vectorizes as well? For the first loop, gcc says:

note: not vectorized: control flow in loop.

I am using gcc 8.2 with the flags -O3 -fopt-info-vec-all, compiling for x86-64 with AVX2.

#include <stdlib.h>
#include <math.h>

void foo(const float * x, const float * y, const int * v, float * vec, float * novec, size_t size) {
    size_t i;
    float bar;
    for (i=0 ; i<size ; ++i){
        bar = x[i] - y[i];
        novec[i] = v[i] ? bar : NAN;
    }
    for (i=0 ; i<size ; ++i){
        bar = x[i];
        vec[i] = v[i] ? bar : NAN;
    }
}

Update: This does autovectorize:

for (i=0 ; i<size ; ++i){
    bar = x[i];
    novec[i] = v[i] ? bar : NAN;
    novec[i] -= y[i];
}

I would still like to know why gcc reports control flow in the first loop.
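
A side note on the updated loop: it should store the same values as the first loop, because NAN - y[i] is still NaN. The sketch below checks that on made-up sample data. The function names (select_after_sub, select_before_sub, same_results, check_equivalence) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* First loop from the question: subtract first, then select. */
void select_after_sub(const float *x, const float *y, const int *v,
                      float *out, size_t size) {
    for (size_t i = 0; i < size; ++i) {
        float bar = x[i] - y[i];
        out[i] = v[i] ? bar : NAN;
    }
}

/* Loop from the update: select first, then subtract. */
void select_before_sub(const float *x, const float *y, const int *v,
                       float *out, size_t size) {
    for (size_t i = 0; i < size; ++i) {
        out[i] = v[i] ? x[i] : NAN;
        out[i] -= y[i];
    }
}

/* Returns 1 when every pair of elements is equal or both NaN, else 0. */
int same_results(const float *a, const float *b, size_t size) {
    for (size_t i = 0; i < size; ++i) {
        if (isnan(a[i]) && isnan(b[i])) continue;
        if (a[i] != b[i]) return 0;
    }
    return 1;
}

/* Hypothetical self-check on made-up sample data. */
void check_equivalence(void) {
    const float x[4] = {1.0f, 2.5f, -3.0f, 8.0f};
    const float y[4] = {0.5f, 2.5f, 1.0f, -2.0f};
    const int v[4] = {1, 0, 7, 0};
    float a[4], b[4];
    select_after_sub(x, y, v, a, 4);
    select_before_sub(x, y, v, b, 4);
    assert(same_results(a, b, 4));
}
```

The stored values match, but the two versions are not identical with respect to FP exception flags: when v[i] is zero, the updated loop never evaluates x[i] - y[i], so any exception that subtraction would have raised does not occur.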

Peter Cordes · Accepted Answer

clang auto-vectorizes both loops, but gcc 8.2 does not vectorize the first one. (https://godbolt.org/z/cnlwuO)

gcc does vectorize the first loop with -ffast-math. Perhaps it's worried about preserving the FP exception flag state from the subtraction?

-fno-trapping-math alone is sufficient for gcc to auto-vectorize the first loop (without the rest of what -ffast-math sets), so apparently FP exceptions are the concern. (https://godbolt.org/z/804ykV). I think gcc is being over-cautious here: the C source computes bar on every iteration, whether or not it's used, so an unconditional vector subtraction should raise the same FP exceptions as the scalar loop.
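
A minimal sketch of trying this out, with the first loop pulled into its own function. The file name novec.c, the function names sub_or_nan and check_sub_or_nan, and the exact build line are assumptions for illustration; the point is only that -fno-trapping-math is added on top of -O3, and results may differ on other gcc versions.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/*
 * Assumed build line (novec.c is a hypothetical file name):
 *   gcc -O3 -mavx2 -fno-trapping-math -fopt-info-vec-optimized -c novec.c
 * -fopt-info-vec-optimized makes gcc print a note for each loop it vectorized.
 */
void sub_or_nan(const float *x, const float *y, const int *v,
                float *out, size_t size) {
    for (size_t i = 0; i < size; ++i) {
        float bar = x[i] - y[i];    /* computed on every iteration */
        out[i] = v[i] ? bar : NAN;  /* select between the result and NaN */
    }
}

/* Hypothetical sanity check: the flag should not change the stored values. */
void check_sub_or_nan(void) {
    const float x[3] = {4.0f, 1.0f, 2.0f};
    const float y[3] = {1.0f, 1.0f, 5.0f};
    const int v[3] = {1, 0, 2};
    float out[3];
    sub_or_nan(x, y, v, out, 3);
    assert(out[0] == 3.0f);
    assert(isnan(out[1]));
    assert(out[2] == -3.0f);
}
```

Unlike -ffast-math, -fno-trapping-math does not enable -ffinite-math-only, so the isnan check in the sketch stays meaningful.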

gcc will auto-vectorize simple FP a[i] = b[i]+c[i] loops without any FP math options.
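
For contrast, here is a sketch of the kind of simple loop meant here, with no select between a computed value and a constant. The names add_arrays and check_add_arrays are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Plain element-wise add: a[i] = b[i] + c[i]. */
void add_arrays(float *a, const float *b, const float *c, size_t size) {
    for (size_t i = 0; i < size; ++i) {
        a[i] = b[i] + c[i];
    }
}

/* Hypothetical sanity check on made-up values that are exact in float. */
void check_add_arrays(void) {
    const float b[3] = {1.0f, 2.5f, -4.0f};
    const float c[3] = {0.5f, 0.25f, 4.0f};
    float a[3];
    add_arrays(a, b, c, 3);
    assert(a[0] == 1.5f);
    assert(a[1] == 2.75f);
    assert(a[2] == 0.0f);
}
```

Because the pointers are not restrict-qualified, gcc typically guards the vectorized loop with a runtime overlap check and keeps a scalar fallback. That is unrelated to the FP exception question.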

Tags: c, gcc, auto-vectorization, avx2

user2133814
