Auto-vectorizing: Convincing the compiler that alias check is not necessary

I am doing some image processing that benefits from vectorization. I have a function that vectorizes fine, but I cannot convince the compiler that the input and output buffers do not overlap, so that no alias checking is necessary. I should be able to do so using __restrict__, but if the buffers are not declared __restrict__ when they arrive as function arguments, there seems to be no way to tell the compiler that I am absolutely sure the two buffers will never overlap.

This is the function:

__attribute__((optimize("tree-vectorize","tree-vectorizer-verbose=6")))
void threshold(const cv::Mat& inputRoi, cv::Mat& outputRoi, const unsigned char th) {

    const int height = inputRoi.rows;
    const int width = inputRoi.cols;

    for (int j = 0; j < height; j++) {
        const uint8_t* __restrict in = (const uint8_t* __restrict) inputRoi.ptr(j);
        uint8_t* __restrict out = (uint8_t* __restrict) outputRoi.ptr(j);
        for (int i = 0; i < width; i++) {
            out[i] = (in[i] < th) ? 255 : 0;
        }
    }
}

The only way I can convince the compiler to not perform the alias checking is if I put the inner loop in a separate function, in which the pointers are defined as __restrict__ arguments. If I declare this inner function as inlined, again the alias checking is activated.
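The workaround described above can be sketched roughly as follows. This is a minimal illustration, not the original code: it assumes GCC or Clang (for `__restrict__` and `__attribute__((noinline))`), replaces `cv::Mat` with a plain row-major `std::vector`, and the names `threshold_row` and `threshold_image` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: the inner loop lives in its own function whose
// pointer parameters are declared __restrict__, so the no-overlap promise
// is part of the function signature. noinline keeps it a separate function.
__attribute__((noinline))
void threshold_row(const std::uint8_t* __restrict__ in,
                   std::uint8_t* __restrict__ out,
                   std::size_t width, std::uint8_t th) {
    for (std::size_t i = 0; i < width; ++i) {
        out[i] = (in[i] < th) ? 255 : 0;
    }
}

// Assumption: the image is a contiguous row-major buffer (stand-in for cv::Mat).
void threshold_image(const std::vector<std::uint8_t>& in,
                     std::vector<std::uint8_t>& out,
                     std::size_t height, std::size_t width, std::uint8_t th) {
    assert(in.size() == height * width && out.size() == in.size());
    for (std::size_t j = 0; j < height; ++j) {
        threshold_row(in.data() + j * width, out.data() + j * width, width, th);
    }
}
```

The caller is responsible for making sure the two buffers really are distinct; passing overlapping pointers to `threshold_row` would break the promise made by `__restrict__`.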

You can see the effect also with this example, which I think is consistent: http://goo.gl/7HK5p7

(Note: I know there might be better ways of writing the same function, but in this case I am just trying to understand how to avoid the alias check.)

Edit:
Problem is solved!! (See answer below)
Using gcc 4.9.2, here is the complete example. Note the use of the compiler flag -fopt-info-vec-optimized in place of the superseded -ftree-vectorizer-verbose=N.
So, for gcc, use #pragma GCC ivdep and enjoy! :)

How do you use vectorization in C++?

There are two ways to vectorize a loop computation in a C/C++ program. Programmers can use intrinsics inside the C/C++ source code to tell the compiler to generate specific SIMD instructions, or the compiler can be set up to vectorize the loop automatically.
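To make the contrast concrete, here is a hedged sketch of the same threshold operation written both ways. It assumes an x86 target with SSE2 for the intrinsics path and falls back to the scalar loop elsewhere; the function names `threshold_scalar` and `threshold_intrinsics` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Plain loop: a candidate for the compiler's auto-vectorizer.
void threshold_scalar(const std::uint8_t* in, std::uint8_t* out,
                      std::size_t n, std::uint8_t th) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = (in[i] < th) ? 255 : 0;
    }
}

// Hand-vectorized variant: 16 pixels per iteration when SSE2 is available.
void threshold_intrinsics(const std::uint8_t* in, std::uint8_t* out,
                          std::size_t n, std::uint8_t th) {
    std::size_t i = 0;
#if defined(__SSE2__)
    const __m128i vth = _mm_set1_epi8(static_cast<char>(th));
    const __m128i ones = _mm_set1_epi8(-1);  // all bits set
    for (; i + 16 <= n; i += 16) {
        const __m128i v =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(in + i));
        // SSE2 has no unsigned byte compare: v >= th exactly when max(v, th) == v.
        const __m128i ge = _mm_cmpeq_epi8(_mm_max_epu8(v, vth), v);
        // ~ge & ones gives 0xFF where v < th and 0x00 elsewhere.
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i),
                         _mm_andnot_si128(ge, ones));
    }
#endif
    // Remaining elements (or everything, without SSE2) use the scalar loop.
    threshold_scalar(in + i, out + i, n - i, th);
}
```

The intrinsics version fixes the instruction set in the source, whereas the scalar loop leaves that choice to the compiler and its flags.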

What is GCC vectorization?

GCC is an advanced compiler; with the optimization flag -O3 or -ftree-vectorize it will look for loops it can vectorize (remember to also specify a target flag such as -mavx if you want AVX instructions to be generated). The source code remains the same, but the code GCC generates is completely different.

What is vectorization CPP?

Vectorization is the use of vector instructions to speed up program execution. It can be done by programmers, who write vector instructions explicitly, or by the compiler. The latter case is called auto-vectorization.

If you are using the Intel compiler, you can try adding the line:

#pragma ivdep

The following paragraph is quoted from the Intel compiler user manual:

The ivdep pragma instructs the compiler to ignore assumed vector dependencies. To ensure correct code, the compiler treats an assumed dependence as a proven dependence, which prevents vectorization. This pragma overrides that decision. Use this pragma only when you know that the assumed loop dependencies are safe to ignore.

In gcc, one should add the line:

#pragma GCC ivdep

inside the function and right before the loop you want to vectorize (see documentation). This is only supported starting from gcc 4.9 and, by the way, makes the use of __restrict__ redundant.
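As a minimal sketch of where the pragma goes, assuming plain pointers with a row stride in place of `cv::Mat` (the function name `threshold_ivdep` and its parameters are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the cv::Mat version: rows are `stride` bytes apart.
void threshold_ivdep(const std::uint8_t* in, std::uint8_t* out,
                     std::size_t height, std::size_t width,
                     std::size_t stride, std::uint8_t th) {
    assert(stride >= width);
    for (std::size_t j = 0; j < height; ++j) {
        const std::uint8_t* in_row = in + j * stride;
        std::uint8_t* out_row = out + j * stride;
        // The pragma must sit immediately before the loop it applies to.
        // It asserts that there are no loop-carried dependencies, so the
        // caller must guarantee that in_row and out_row do not overlap.
#pragma GCC ivdep
        for (std::size_t i = 0; i < width; ++i) {
            out_row[i] = (in_row[i] < th) ? 255 : 0;
        }
    }
}
```

Compiling with something like `g++ -O3 -fopt-info-vec-optimized` should report whether the inner loop was vectorized; the exact wording of the report differs between GCC versions. Compilers that do not know the pragma will typically just warn and ignore it.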

Antonio

PhD AP EcE
