I am working with the SSE2 instruction set in MS Visual Studio. I am using it to do some calculations with 16-bit data.
Suppose I have 8 values loaded into an SSE register. I want to add a constant (e.g. 42) to all of them. Here is how I would like my code to look.
__m128i values; // 8 values, 16 bits each
const __m128i my_const_42 = ???; // What should i write here?
values = _mm_add_epi16(values, my_const_42); // Add 42 to the 8 values
Now, how can I define the constant? The following two ways work, but one is inefficient, and the other is ugly.
my_const_42 = _mm_set_epi16(42, 42, 42, 42, 42, 42, 42, 42)
- compiler generates 8 commands to "build" the constant
my_const_42 = {42, 0, 42, 0, 42, 0, 42, 0, 42, 0, 42, 0, 42, 0, 42, 0}
- hard to understand what is going on; changing 42 to e.g. -42 is not trivial
Is there any way to express the 128-bit constant more conveniently?
Ninety percent of the battle is finding the correct intrinsic. The MSDN Library is pretty well organized; start at this page. From there, drill down like this:
Set is golden: out pops _mm_set1_epi16(short w), which copies w into all eight 16-bit lanes.
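For illustration, here is a minimal sketch of how the code from the question might look with _mm_set1_epi16. The helper names add_42_epi16 and check_add_42 are hypothetical (they are not part of any library), and the unaligned load/store is only an assumption about how the data reaches the register.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: adds 42 to each of 8 packed 16-bit values. */
static void add_42_epi16(const int16_t in[8], int16_t out[8])
{
    /* Broadcast 42 into all eight 16-bit lanes. */
    const __m128i my_const_42 = _mm_set1_epi16(42);

    /* Unaligned load/store so the arrays need no special alignment. */
    __m128i values = _mm_loadu_si128((const __m128i *)in);
    values = _mm_add_epi16(values, my_const_42);
    _mm_storeu_si128((__m128i *)out, values);
}

/* Hypothetical self-check: every lane must equal input + 42. */
static void check_add_42(void)
{
    const int16_t in[8] = {0, 1, 2, 3, -42, 100, 1000, -1000};
    int16_t out[8];
    int i;

    add_42_epi16(in, out);
    for (i = 0; i < 8; ++i)
        assert(out[i] == in[i] + 42);
}
```

Changing the constant to -42 is then just _mm_set1_epi16(-42), which addresses the readability complaint about the byte-wise initializer.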
Something to note about creating constants in SSE (or NEON): loading data from memory can be much slower than executing a few register-only instructions. If the constant you need can be created through code, that is often the faster choice. Here are some examples of constants created through code:
xmmTemp = _mm_cmpeq_epi16(xmmA, xmmA); // 0xFFFF in every lane
xmmTemp = _mm_slli_epi16 (xmmTemp, 7); // now it has 0xFF80 (-128)
xmmTemp = _mm_cmpeq_epi16(xmmA, xmmA); // 0xFFFF in every lane
xmmTemp = _mm_slli_epi16 (xmmTemp, 15); // 0x8000
xmmTemp = _mm_srli_epi16 (xmmTemp, 11); // 0x0010 (positive 16)
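The two sequences above can be wrapped into small functions and checked, as in the sketch below. The function names are hypothetical, and a zeroed register stands in for xmmA (any initialized register compares equal to itself). Note also that an optimizing compiler is free to fold these intrinsics into a single constant load, so whether this avoids a memory access depends on the compiler.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* All bits set: comparing a register with itself yields 0xFFFF per lane. */
static __m128i all_ones_epi16(void)
{
    const __m128i zero = _mm_setzero_si128();
    return _mm_cmpeq_epi16(zero, zero);
}

/* 0xFFFF << 7 = 0xFF80, which is -128 as a signed 16-bit value. */
static __m128i const_minus_128_epi16(void)
{
    return _mm_slli_epi16(all_ones_epi16(), 7);
}

/* 0xFFFF << 15 = 0x8000; logical shift right by 11 gives 0x0010 = 16. */
static __m128i const_16_epi16(void)
{
    const __m128i high_bit = _mm_slli_epi16(all_ones_epi16(), 15);
    return _mm_srli_epi16(high_bit, 11);
}

/* Hypothetical self-check: store each vector and inspect all 8 lanes. */
static void check_generated_constants(void)
{
    int16_t lanes[8];
    int i;

    _mm_storeu_si128((__m128i *)lanes, const_minus_128_epi16());
    for (i = 0; i < 8; ++i)
        assert(lanes[i] == -128);

    _mm_storeu_si128((__m128i *)lanes, const_16_epi16());
    for (i = 0; i < 8; ++i)
        assert(lanes[i] == 16);
}
```

This trick only works for values reachable by a few shifts of an all-ones or all-zeros pattern; an arbitrary value such as 42 is more naturally written with _mm_set1_epi16.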