Automatic vectorization with g++ of a loop with bit operations

Question

Is it possible to vectorize this loop (with g++)?

char x;
int k;
for(int s = 0; s < 4; s++) {
  A[k++] += B[x&3];
  x >>= 2;
}

A and B are pointers to non-overlapping float arrays, and B has valid indices 0 to 3. This is for an R package, so I need to maximize portability: ideally the loop would be rewritten so that g++ can vectorize it on its own, since I don't know how to make SSE code portable in this context (the RcppEigen package makes the Eigen library available, so using Eigen is possible).

Many thanks in advance for your thoughts.

P.S. The loop is nested in code that looks like this:

int k = 0;
for(size_t j = 0; j < J; j++) {
  char x = data[j];
  for(int s = 0; s < 4; s++) {
    A[k++] += B[x&3];
    x >>= 2;
  }
}
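To make the bit layout concrete, here is a minimal scalar sketch of the same computation. The function names `unpack_indices` and `accumulate_scalar` are hypothetical, and the sketch assumes `data` can be read as unsigned bytes; it is meant as a reference to compare faster versions against, not as an optimized implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: split one byte into its four 2-bit fields,
// lowest bits first, as the inner loop does with x & 3 and x >>= 2.
std::array<int, 4> unpack_indices(unsigned char x) {
    std::array<int, 4> idx{};
    for (int s = 0; s < 4; s++) {
        idx[s] = x & 3;
        x >>= 2;
    }
    return idx;
}

// Scalar reference for the nested loop: A must hold 4*J floats, B holds 4.
void accumulate_scalar(const unsigned char* data, std::size_t J,
                       const float* B, float* A) {
    assert(data != nullptr && B != nullptr && A != nullptr);
    std::size_t k = 0;
    for (std::size_t j = 0; j < J; j++) {
        const std::array<int, 4> idx = unpack_indices(data[j]);
        for (int s = 0; s < 4; s++)
            A[k++] += B[idx[s]];
    }
}
```

For example, the byte 0xE4 (binary 11 10 01 00) unpacks to the indices 0, 1, 2, 3, so it adds B[0], B[1], B[2], B[3] to four consecutive elements of A.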

ErmIg · Accepted Answer

Here is a solution using AVX2 intrinsics:

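#include <immintrin.h>  // AVX2 intrinsics; compile with -mavx2
#include <cstring>      // std::memcpy

// Two copies of B, so an index 0..3 in any lane selects B[index].
__m256 vB = _mm256_setr_ps(B[0], B[1], B[2], B[3], B[0], B[1], B[2], B[3]);
__m256i vShift = _mm256_setr_epi32(0, 2, 4, 6, 8, 10, 12, 14);
__m256i vMask = _mm256_set1_epi32(3);
for (size_t j = 0; j < J/2; j++)
{
    // Read two bytes of data at once (little-endian order, as on x86).
    unsigned short x;
    std::memcpy(&x, (const char*)data + 2*j, sizeof x);
    // Lane i holds the 2-bit field (x >> 2*i) & 3.
    __m256i vIndex = _mm256_and_si256(_mm256_srlv_epi32(_mm256_set1_epi32(x), vShift), vMask);
    _mm256_storeu_ps(A, _mm256_add_ps(_mm256_loadu_ps(A), _mm256_permutevar8x32_ps(vB, vIndex)));
    A += 8;
}
// If J is odd, the last byte of data still has to be handled by the scalar loop.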
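Since the question asks for something portable, one alternative to the AVX2 intrinsics is a lookup table. This is a different technique from the answer above, sketched under the assumptions that `data` can be read as unsigned bytes and that B stays constant across the whole outer loop. The names `build_table` and `accumulate_table` are hypothetical. Each byte value gets a precomputed row of four floats, so the inner loop becomes a plain fixed-length add with no bit operations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for every possible byte value x, precompute the four
// floats the inner loop would add, so T[4*x + s] == B[(x >> 2*s) & 3].
std::vector<float> build_table(const float* B) {
    assert(B != nullptr);
    std::vector<float> T(256 * 4);
    for (int x = 0; x < 256; x++)
        for (int s = 0; s < 4; s++)
            T[4 * x + s] = B[(x >> (2 * s)) & 3];
    return T;
}

// Adds the decoded values to A, which must hold 4*J floats.
// T is the 1024-float table returned by build_table.
void accumulate_table(const unsigned char* data, std::size_t J,
                      const float* T, float* A) {
    for (std::size_t j = 0; j < J; j++) {
        const float* row = T + 4 * static_cast<std::size_t>(data[j]);
        float* out = A + 4 * j;
        // Fixed trip count, contiguous loads and stores.
        for (int s = 0; s < 4; s++)
            out[s] += row[s];
    }
}
```

Whether g++ actually turns the inner loop into a single 4-float add depends on the compiler version and flags, so it is worth checking the vectorizer report (for instance with `-O3 -fopt-info-vec`) rather than assuming it. The table costs 4 KB and one pass over 256 entries, which only pays off when J is reasonably large.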

Tags: c++, vectorization, g++, simd

Elvis
