I need to find an index/position of an 8-bit value element N in an array ARR via SIMD. It must be a fast fashion.
For now the algorithm is that I'd load 8-bit values of ARR into one SIMD register and a character code of N into other SIMD register.
Then I'd use negation and check which byte is successful with popcnt.
Is there a faster way?
The operations may be saturated used if needed.
Which instruction set/architecture are you using? That will somewhat impact the 'correct' answer to this question.
in SSE:
#include <immintrin.h>
#include <stdio.h>
int byteIndex(__m128i ARR, __m128i N)
{
__m128i cmp = _mm_cmpeq_epi8(ARR, N);
int mask = _mm_movemask_epi8(cmp);
return _tzcnt_u32(mask);
}
int main()
{
__m128i ARR = _mm_setr_epi8(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15);
// test case that will work
__m128i N = _mm_set1_epi8(3);
printf("%d\n", byteIndex(ARR, N)); ///< prints '3'
// test case that will fail
__m128i F = _mm_set1_epi8(16);
printf("%d\n", byteIndex(ARR, F)); ///< prints '32'
return 1;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With