How can I divide 16 8-bit integers by 4 (or shift them 2 to the right) using SSE intrinsics?
Unfortunately there are no SSE shift instructions for 8 bit elements. If the elements are 8 bit unsigned then you can use a 16 bit shift and mask out the unwanted high bits, e.g.
v = _mm_srli_epi16(v, 2);
v = _mm_and_si128(v, _mm_set1_epi8(0x3f));
For 8 bit signed elements it's a little fiddlier, but still possible, although it might just be easier to unpack to 16 bits, do the shifts, then pack back to 8 bits.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With