How to increment a vector in AVX/AVX2

I want to use intrinsics to increment the elements of a SIMD vector. The simplest way seems to be to add 1 to each element, like this:

(Note: vec_inc has already been set to a vector of 1s.)

vec = _mm256_add_epi16(vec, vec_inc);

But is there a special instruction for incrementing a vector, like inc on this page? Or is there any other, easier way?

What is the difference between AVX and AVX2?

For floating-point code, the main difference between AVX and AVX2 is the availability of the new FMA instructions (strictly a separate extension, but one that arrived together with AVX2 on Intel CPUs); both AVX and AVX2 use 256-bit FP registers. The main advantage of AVX2 is for integer code and data types, where you can expect up to a 2x speedup; for FP code, a speedup of about 8% over AVX is already good.

What is __m256d?

__m256d: a vector of four double-precision floating-point numbers (4 x 64 = 256 bits).

What are AVX intrinsics?

AVX provides intrinsic functions that combine one or more values into a 256-bit vector. Table 2 lists their names and provides a description of each. There are similar intrinsics that initialize 128-bit vectors, but those are provided by SSE, not AVX.
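
As a rough illustration of these initialization intrinsics, here is a minimal sketch that builds __m256d vectors in three ways and stores them to plain arrays. The function names fill_set, fill_setr and fill_set1 are made up for this example, and the target attribute is a GCC/Clang extension assumed here so that the code compiles without -mavx; with other compilers you would enable AVX through compiler options instead.

```c
#include <immintrin.h>

/* Hypothetical helper: _mm256_set_pd takes its arguments from the highest
   element down to the lowest, so element 0 of the vector is 1.0 here. */
__attribute__((target("avx")))
void fill_set(double out[4])
{
    __m256d v = _mm256_set_pd(4.0, 3.0, 2.0, 1.0);
    _mm256_storeu_pd(out, v);   /* unaligned store: out[0] gets element 0 */
}

/* Hypothetical helper: _mm256_setr_pd ("reversed") takes its arguments in
   memory order, lowest element first. */
__attribute__((target("avx")))
void fill_setr(double out[4])
{
    __m256d v = _mm256_setr_pd(1.0, 2.0, 3.0, 4.0);
    _mm256_storeu_pd(out, v);
}

/* Hypothetical helper: _mm256_set1_pd broadcasts one value to all four
   elements. */
__attribute__((target("avx")))
void fill_set1(double out[4], double x)
{
    _mm256_storeu_pd(out, _mm256_set1_pd(x));
}
```

Under these assumptions, fill_set and fill_setr should both write 1.0, 2.0, 3.0, 4.0 to out, which shows the argument-order difference between the set and setr variants.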

The INC instruction is not a SIMD instruction; it operates on scalar integers. As you and Paul already suggested, the simplest way is to add 1 to each vector element, which you can do by adding a vector of 1s.

If you want to simulate an intrinsic, you can implement your own function:

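#include <immintrin.h>

// Adds 1 to each of the sixteen 16-bit elements of a (requires AVX2).
static inline __m256i _mm256_inc_epi16(__m256i a)
{
    return _mm256_add_epi16(a, _mm256_set1_epi16(1));
}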
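
To show how such a helper might be used on real data, here is a minimal sketch that increments a 16-element int16_t array in place. The names inc_epi16 and increment16 are hypothetical (chosen for this example, not Intel intrinsics), and the target attribute is a GCC/Clang extension assumed here so that the code compiles without -mavx2.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper (not an Intel intrinsic): adds 1 to each of the
   sixteen signed 16-bit elements of a. */
__attribute__((target("avx2")))
static inline __m256i inc_epi16(__m256i a)
{
    return _mm256_add_epi16(a, _mm256_set1_epi16(1));
}

/* Hypothetical wrapper: increments all 16 elements of values in place.
   Unaligned load/store are used, so values needs no special alignment. */
__attribute__((target("avx2")))
void increment16(int16_t *values)
{
    __m256i v = _mm256_loadu_si256((const __m256i *)values);
    v = inc_epi16(v);
    _mm256_storeu_si256((__m256i *)values, v);
}
```

Note that _mm256_add_epi16 wraps around on overflow, so an element holding 32767 becomes -32768. If you would rather clamp at the maximum, _mm256_adds_epi16 is the saturating variant.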

For similar questions on x86 intrinsics in the future, you can find the collection of Intel ISA intrinsics at Intel's Intrinsics Guide. Also see the extensive resources documented under the x86 and sse tag info:

x86 tag info
sse tag info

Tags: x86, assembly, simd, intrinsics, avx2

Hossein Amiri

1 Answer

plasmacel
