
Usage of _mm_shuffle_epi8 intrinsic

Can someone please explain the _mm_shuffle_epi8 SSSE3 intrinsic? I know it shuffles the 16 8-bit integers in an __m128i, but I'm not sure how I could use it.

I basically want to use _mm_shuffle_epi8 to modify the function below to get better performance.

while (not done) {
    dest[i+0] = src[j].a;
    dest[i+1] = src[j].b;
    dest[i+2] = src[j].c;
    dest[i+3] = src[j+1].a;
    dest[i+4] = src[j+1].b;
    dest[i+5] = src[j+1].c;
    i += 6;
    j += 2;
}
Peter Lee asked Oct 08 '12 09:10


2 Answers

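_mm_shuffle_epi8 (better known as pshufb) essentially does this, where dst holds the data bytes (the intrinsic's first argument) and src holds the byte indices (its second argument):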

temp = dst;
for (int i = 0; i < 16; i++)
    dst[i] = (src[i] & 0x80) == 0 ? temp[src[i] & 15] : 0;

As for whether you can use it here, it's impossible to tell without knowing the types involved. It won't be "nice" anyway because the destination is a block of 6 bytes (or words? or dwords?). You could make that work by unrolling and doing a lot of shifting and or-ing.
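To make that concrete, here is a minimal sketch under one loudly stated assumption: that each source element is a 4-byte struct of uint8_t fields a, b, c plus one padding byte. The question does not give the real types, so the Src4 struct and the pack_abc name are hypothetical. In that case one shuffle turns four structs (16 bytes) into 12 packed bytes; the mask indices with the high bit set zero the four unused bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <tmmintrin.h>  /* SSSE3: _mm_shuffle_epi8 */

/* Hypothetical layout: the question does not state the real types. */
typedef struct { uint8_t a, b, c, pad; } Src4;

#if defined(__GNUC__)
__attribute__((target("ssse3")))  /* lets GCC/Clang build this without -mssse3 */
#endif
void pack_abc(uint8_t *dest, const Src4 *src, size_t count)
{
    /* Result byte i is taken from source byte mask[i]; an index with the
       high bit set (0x80) produces a zero byte instead. */
    const __m128i mask = _mm_setr_epi8(0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 14,
                                       (char)0x80, (char)0x80,
                                       (char)0x80, (char)0x80);
    size_t i = 0, j = 0;

    /* Each iteration stores 16 bytes but only 12 are meaningful, so the
       vector path runs only while at least 16 bytes of dest remain
       (3 * (count - j) >= 16, i.e. at least 6 elements left). The 4 zero
       bytes are overwritten by the next iteration or by the scalar tail. */
    for (; j + 6 <= count; i += 12, j += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + j));
        _mm_storeu_si128((__m128i *)(dest + i), _mm_shuffle_epi8(v, mask));
    }

    /* Scalar tail for the remaining elements. */
    for (; j < count; i += 3, ++j) {
        dest[i + 0] = src[j].a;
        dest[i + 1] = src[j].b;
        dest[i + 2] = src[j].c;
    }
}
```

If the real fields are wider than a byte, or the struct has no padding, the mask and the loop strides change, and a single shuffle may no longer be enough.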

harold answered Oct 10 '22 09:10


Here's an example of using the intrinsic; you'll have to find out how to apply it to your particular situation. This code endian-swaps four 32-bit integers at a time:

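#include <tmmintrin.h>  /* SSSE3: _mm_shuffle_epi8 */

/* length is a count of 32-bit integers and must be a multiple of 4;
   otherwise the last iteration reads and writes past the end. */
unsigned int *bswap(unsigned int *destination, unsigned int *source, int length) {
    int i;
    /* _mm_set_epi8 lists bytes from most to least significant, so this
       mask reverses the four bytes inside each 32-bit lane. */
    __m128i mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11, 4, 5, 6, 7, 0, 1, 2, 3);
    for (i = 0; i < length; i += 4) {
        _mm_storeu_si128((__m128i *)&destination[i],
            _mm_shuffle_epi8(_mm_loadu_si128((__m128i *)&source[i]), mask));
    }
    return destination;
}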
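The loop above consumes four integers per iteration, so it overruns both buffers when length is not a multiple of 4. One possible way around that, sketched here as a variant rather than as part of the original answer, is to run the shuffle only on full groups of four and finish with a scalar swap. The name bswap_any and the switch to uint32_t and size_t are choices made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <tmmintrin.h>  /* SSSE3: _mm_shuffle_epi8 */

#if defined(__GNUC__)
__attribute__((target("ssse3")))  /* lets GCC/Clang build this without -mssse3 */
#endif
uint32_t *bswap_any(uint32_t *destination, const uint32_t *source, size_t length)
{
    /* Same mask as above: reverse the bytes within each 32-bit lane. */
    const __m128i mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11,
                                      4, 5, 6, 7, 0, 1, 2, 3);
    size_t i = 0;

    /* Vector path: only while a full group of four integers remains. */
    for (; i + 4 <= length; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)&source[i]);
        _mm_storeu_si128((__m128i *)&destination[i], _mm_shuffle_epi8(v, mask));
    }

    /* Scalar tail: swap the last zero to three integers one at a time. */
    for (; i < length; ++i) {
        uint32_t x = source[i];
        destination[i] = (x >> 24) | ((x >> 8) & 0x0000FF00u)
                       | ((x << 8) & 0x00FF0000u) | (x << 24);
    }
    return destination;
}
```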
jcomeau_ictx answered Oct 10 '22 09:10