I have just started using SSE and I am confused about how to get the maximum integer value (max) of a __m128i. For instance:
__m128i t = _mm_setr_epi32(0, 1, 2, 3);
// max(t) = 3;
Searching around led me to the MAXPS instruction, but I can't seem to find how to use that with "xmmintrin.h".
Also, is there any documentation for "xmmintrin.h" that you would recommend, rather than looking into the header file itself?
In case anyone cares, and since intrinsics seem to be the way to go these days, here is a solution in terms of intrinsics. Note that _mm_max_epi32 is an SSE4.1 intrinsic, declared in "smmintrin.h".
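#include <smmintrin.h>  /* SSE4.1: _mm_max_epi32 */

int horizontal_max_Vec4i(__m128i x) {
    /* Move the upper two lanes (x2, x3) down to lanes 0 and 1. */
    __m128i max1 = _mm_shuffle_epi32(x, _MM_SHUFFLE(0,0,3,2));
    /* Lane 0 = max(x0, x2), lane 1 = max(x1, x3). */
    __m128i max2 = _mm_max_epi32(x, max1);
    /* Move lane 1 down to lane 0. */
    __m128i max3 = _mm_shuffle_epi32(max2, _MM_SHUFFLE(0,0,0,1));
    /* Lane 0 now holds the maximum of all four lanes. */
    __m128i max4 = _mm_max_epi32(max2, max3);
    return _mm_cvtsi128_si32(max4);
}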
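If SSE4.1 is not available, the same shuffle-and-compare pattern can be sketched with SSE2 only. This is a minimal sketch, not a tuned implementation: the names max_epi32_sse2 and horizontal_max_Vec4i_sse2 are made up for illustration, and the per-lane max is emulated with a compare plus a bitwise select.

```c
#include <emmintrin.h>  /* SSE2; also pulls in xmmintrin.h for _MM_SHUFFLE */

/* Per-lane signed 32-bit max using only SSE2 (hypothetical helper name). */
static inline __m128i max_epi32_sse2(__m128i a, __m128i b) {
    __m128i a_gt_b = _mm_cmpgt_epi32(a, b);            /* all-ones where a > b */
    return _mm_or_si128(_mm_and_si128(a_gt_b, a),      /* keep a where a > b   */
                        _mm_andnot_si128(a_gt_b, b));  /* keep b elsewhere     */
}

/* Horizontal max of four signed 32-bit lanes (hypothetical name). */
int horizontal_max_Vec4i_sse2(__m128i x) {
    /* Bring lanes 2 and 3 down to lanes 0 and 1. */
    __m128i hi = _mm_shuffle_epi32(x, _MM_SHUFFLE(0, 0, 3, 2));
    __m128i m2 = max_epi32_sse2(x, hi);   /* lane 0: max(x0,x2), lane 1: max(x1,x3) */
    /* Bring lane 1 down to lane 0. */
    __m128i sw = _mm_shuffle_epi32(m2, _MM_SHUFFLE(0, 0, 0, 1));
    __m128i m1 = max_epi32_sse2(m2, sw);  /* lane 0: max of all four lanes */
    return _mm_cvtsi128_si32(m1);
}
```

Calling horizontal_max_Vec4i_sse2(_mm_setr_epi32(0, 1, 2, 3)) returns 3. The extra compare and mask operations make this longer than the SSE4.1 version, so it is mainly of interest when older hardware has to be supported.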
I don't know if that's any better than this:
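int horizontal_max_Vec4i(__m128i x) {
    /* 16-byte alignment is required by _mm_store_si128 (GCC/Clang attribute syntax). */
    int result[4] __attribute__((aligned(16))) = {0};
    _mm_store_si128((__m128i *) result, x);
    /* max is assumed to be a two-argument integer maximum, e.g. std::max in C++ or a user-defined macro in C. */
    return max(max(max(result[0], result[1]), result[2]), result[3]);
}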
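As for MAXPS from the question: it operates on packed single-precision floats (__m128), not integers, and "xmmintrin.h" exposes it as _mm_max_ps. A minimal sketch of a float horizontal max along the same lines follows; the function name horizontal_max_Vec4f is made up for illustration, and NaN inputs are not given any special treatment.

```c
#include <xmmintrin.h>  /* SSE: _mm_max_ps, _mm_max_ss, _mm_movehl_ps, _mm_shuffle_ps */

/* Horizontal max of four floats (hypothetical name). */
float horizontal_max_Vec4f(__m128 x) {
    /* Copy the upper two lanes (x2, x3) into lanes 0 and 1. */
    __m128 hi = _mm_movehl_ps(x, x);
    /* MAXPS: lane 0 = max(x0, x2), lane 1 = max(x1, x3). */
    __m128 m2 = _mm_max_ps(x, hi);
    /* Broadcast lane 1 so it can be compared against lane 0. */
    __m128 sw = _mm_shuffle_ps(m2, m2, _MM_SHUFFLE(1, 1, 1, 1));
    /* MAXSS: only lane 0 is needed for the final result. */
    __m128 m1 = _mm_max_ss(m2, sw);
    return _mm_cvtss_f32(m1);
}
```

With t = _mm_setr_ps(0, 1, 2, 3), horizontal_max_Vec4f(t) returns 3.0f.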