
How to perform element-wise left shift with __m128i?

Tags: c, avx, sse

The SSE shift instructions I have found can only shift by the same amount on all the elements:

  • _mm_sll_epi32()
  • _mm_slli_epi32()

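_mm_sll_epi32() takes its count from the low 64 bits of a second vector and _mm_slli_epi32() takes an integer count, but either way a single count is applied to every element.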

Is there a way to apply different shifts to the different elements? Something like this:

__m128i a, b;

r0 := a0 << b0;
r1 := a1 << b1;
r2 := a2 << b2;
r3 := a3 << b3;
user1468756 asked Jan 16 '23 07:01


2 Answers

There exists the _mm_shl_epi32() intrinsic that does exactly that.

http://msdn.microsoft.com/en-us/library/gg445138.aspx

However, it requires the XOP instruction set, which is available only on AMD's Bulldozer-family processors (Interlagos is the server variant). AMD dropped XOP with Zen, and it is not available on any Intel processor.

If you want to do it without XOP instructions (and without AVX2, whose _mm_sllv_epi32() also shifts each element by its own count), you will need to do it the hard way: pull the elements out and shift them one by one. With SSE4.1, you can do this using the following intrinsics:

  • _mm_insert_epi32()
  • _mm_extract_epi32()

http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011/compiler_c/intref_cls/common/intref_sse41_reg_ins_ext.htm

Those will let you extract parts of a 128-bit register into regular registers to do the shift and put them back.

If you go with the latter method, it'll be horrifically inefficient. That's why _mm_shl_epi32() exists in the first place.
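As a rough sketch of the extract/shift/insert approach described above: the helper name shl_each_epi32 is made up for illustration, the code assumes every shift count is in the range 0..31, and the target attribute is only there so that GCC or Clang will build it without -msse4.1.

```c
#include <smmintrin.h>  /* SSE4.1: _mm_extract_epi32, _mm_insert_epi32 */

/* Hypothetical helper, not part of any library.
   Shifts each 32-bit lane of a left by the count in the matching lane of b.
   Assumes every count is in 0..31 (larger counts are undefined in C). */
#if defined(__GNUC__) || defined(__clang__)
__attribute__((target("sse4.1")))  /* build without -msse4.1 on GCC/Clang */
#endif
__m128i shl_each_epi32(__m128i a, __m128i b)
{
    /* Lane indices must be compile-time constants, hence no loop. */
    unsigned r0 = (unsigned)_mm_extract_epi32(a, 0) << _mm_extract_epi32(b, 0);
    unsigned r1 = (unsigned)_mm_extract_epi32(a, 1) << _mm_extract_epi32(b, 1);
    unsigned r2 = (unsigned)_mm_extract_epi32(a, 2) << _mm_extract_epi32(b, 2);
    unsigned r3 = (unsigned)_mm_extract_epi32(a, 3) << _mm_extract_epi32(b, 3);

    /* Rebuild the vector: lane 0 first, then insert the other three. */
    __m128i r = _mm_cvtsi32_si128((int)r0);
    r = _mm_insert_epi32(r, (int)r1, 1);
    r = _mm_insert_epi32(r, (int)r2, 2);
    r = _mm_insert_epi32(r, (int)r3, 3);
    return r;
}
```

Every lane makes a round trip through a general-purpose register, which is where the cost comes from.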

Mysticial answered Jan 26 '23 01:01


Without XOP, your options are limited. If you can control the format of the shift count argument, then you can use _mm_mullo_epi16() (or _mm_mullo_epi32() from SSE4.1 for 32-bit elements), since multiplying by a power of two is the same as shifting left by that power.

For example, if you want to shift the eight 16-bit elements in an SSE register by <0, 1, 2, 3, 4, 5, 6, 7>, you can multiply by 2 raised to each shift count, i.e., by <1, 2, 4, 8, 16, 32, 64, 128>.
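A minimal sketch of the multiply trick for that example, using the SSE2 intrinsic _mm_mullo_epi16() on 128-bit registers. The helper name shl_0_to_7_epi16 is hypothetical, and the factors are the powers of two 1, 2, 4, ..., 128 (that is, 2^0 through 2^7):

```c
#include <emmintrin.h>  /* SSE2: _mm_setr_epi16, _mm_mullo_epi16 */

/* Hypothetical helper, not part of any library.
   Shifts the eight 16-bit lanes of a left by 0, 1, ..., 7 bits
   respectively by multiplying each lane by 2^count. */
__m128i shl_0_to_7_epi16(__m128i a)
{
    /* Lane i holds 2^i; _mm_setr_epi16 lists lanes from lowest to highest. */
    const __m128i pow2 = _mm_setr_epi16(1, 2, 4, 8, 16, 32, 64, 128);

    /* mullo keeps the low 16 bits of each product, exactly as a left
       shift within a 16-bit lane would. */
    return _mm_mullo_epi16(a, pow2);
}
```

If the counts are only known at run time, the vector of powers of two has to be built first, so this works best when the caller can supply the multipliers directly.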

mattst88 answered Jan 26 '23 01:01