
not vectorized: not suitable for gather D.32476_34 = *D.32475_33;

I want the compiler to autovectorize my code, but I can't seem to get it right. With the -ftree-vectorizer-verbose=6 option enabled, the message I get is 125: not vectorized: not suitable for gather D.32476_34 = *D.32475_33;.

My question is: what does this whole message mean, and what do those numbers stand for?

Below, I have created a simple test example that produces the same message, so I assume the issues are related.

static void not_suitable_for_gather(unsigned char * __restrict__ pixels, int * __restrict__ indices, int indices_num)
{   
  for (int i = 0; i < indices_num; ++i)
  {
    int idx = indices[i] * 4;

    float r = pixels[idx + 0];
    float g = pixels[idx + 1];
    float b = pixels[idx + 2];
    float a = pixels[idx + 3] / 255.0f;

    pixels[idx + 0] = r;
    pixels[idx + 1] = g;
    pixels[idx + 2] = b;
    pixels[idx + 3] = a * 255.0f;
  }

  return;
}

Also, while creating my example, I came across a whole bunch of other messages. I am not really sure what they mean or why the particular construct would be problematic to vectorize. Is there any guide, book, tutorial, blog, or anything else that would explain these things to me?

In case it matters, I am using MinGW 4.7 32-bit with QtCreator 2.7.0.

EDIT: The conclusion:

According to my tests and the suggestions from this post, the message is most likely related to accessing data indirectly via an auxiliary index array. That leads to a gather/scatter addressing scheme, which GCC is at present not able (or not willing) to vectorize. I was able to produce vectorized code with clang++ 3.2-1, though.
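To make the conclusion concrete, here is a minimal sketch of the two addressing schemes side by side. It is not the code from my tests; the function names process_pixel, process_contiguous and process_indexed are made up for illustration. Both loops do the same per-pixel work and differ only in how the pixel address is computed.

```c
/* Same per-pixel work as in the question, applied to the pixel at px.
   (Hypothetical helper, introduced only for this sketch.) */
static void process_pixel(unsigned char *px)
{
  float r = px[0];
  float g = px[1];
  float b = px[2];
  float a = px[3] / 255.0f;

  px[0] = (unsigned char)r;
  px[1] = (unsigned char)g;
  px[2] = (unsigned char)b;
  px[3] = (unsigned char)(a * 255.0f);
}

/* Direct addressing: pixel i starts at byte 4 * i, so every address is a
   linear function of the loop counter. */
static void process_contiguous(unsigned char *pixels, int pixel_count)
{
  for (int i = 0; i < pixel_count; ++i)
    process_pixel(pixels + 4 * i);
}

/* Indirect addressing: the address depends on a value loaded from indices[],
   so handling several iterations at once needs a gather for the loads and a
   scatter for the stores. */
static void process_indexed(unsigned char * __restrict__ pixels,
                            const int * __restrict__ indices,
                            int indices_num)
{
  for (int i = 0; i < indices_num; ++i)
    process_pixel(pixels + indices[i] * 4);
}
```

Compiling both functions with -ftree-vectorizer-verbose=6 and comparing the reports should show whether the index array is what triggers the gather message. Whether the first loop actually gets vectorized still depends on the compiler version and flags.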

asked Oct 21 '22 at 05:10 by jcxz




1 Answer

A vectorized version of your code would conceptually look like (using OpenCL syntax):

for (int i = 0; i < indices_num; ++i)
{
  int idx = indices[i]; // vload4/vstore4 scale the offset by 4 themselves
  float4 factor = (float4)(1.0f, 1.0f, 1.0f, 255.0f);

  uchar4 x1 = vload4(idx, pixels); // Line A
  float4 x2 = convert_float4(x1);
  float4 x3 = x2 / factor;
  float4 x4 = x3 * factor;
  uchar4 x5 = convert_uchar4(x4);
  vstore4(x5, idx, pixels); // Line B
}

But hold on: on line A you load four chars (aka uint8) from an address that depends on indices[i], and on line B you store them back to it. Doing that for several iterations at once requires gather and scatter instructions, which are not a common capability on x86; the only instruction sets I know of that support this are AVX2 (Intel Haswell and later, gathers only) and the Xeon Phi's. Unless you're compiling for one of those, that could explain why your compiler rejects this vectorization opportunity.

The compiler can of course individually load 4 uint8s, build a vector from them, do the required vector operations, and manually store 4 values back; but I'm guessing that without gathers and scatters, individually loading and storing the values was probably deemed too expensive compared with the amount of actual work you save by vectorizing.
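As a rough illustration of that fallback, the sketch below emulates it in plain C: four individual byte loads fill a 4-lane float array, the lane-wise operations run on that array, and four individual stores write the result back. The function name process_indexed_by_lane and the lanes/factor arrays are my own illustrative choices; this is not what any particular compiler emits, only the shape of the work.

```c
/* Scalar emulation of "load 4 bytes one by one, build a vector, operate on
   it, store 4 bytes one by one". Hypothetical name, for illustration only. */
static void process_indexed_by_lane(unsigned char * __restrict__ pixels,
                                    const int * __restrict__ indices,
                                    int indices_num)
{
  /* Per-lane divisor/multiplier: only the alpha lane is scaled by 255. */
  static const float factor[4] = { 1.0f, 1.0f, 1.0f, 255.0f };

  for (int i = 0; i < indices_num; ++i)
  {
    unsigned char *px = pixels + indices[i] * 4;
    float lanes[4];

    /* Individual loads that build the 4-lane vector. */
    for (int c = 0; c < 4; ++c)
      lanes[c] = (float)px[c];

    /* The actual vector work: one divide and one multiply per lane. */
    for (int c = 0; c < 4; ++c)
      lanes[c] = (lanes[c] / factor[c]) * factor[c];

    /* Individual stores that write the 4 lanes back. */
    for (int c = 0; c < 4; ++c)
      px[c] = (unsigned char)lanes[c];
  }
}
```

Note how little arithmetic sits between the loads and the stores: two operations per lane against eight memory accesses per pixel, which is the cost imbalance described above.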

answered Oct 27 '22 at 20:10 by Oak