I'm working with SSE intrinsic functions. I have an __m128i representing an array of 8 signed short (16 bit) values.
Is there a function to get the sign of each element?
EDIT1: something that can be used like this:
short tmpVec[8];
__m128i tmp, sgn;
for (i-0;i<8;i++)
tmp.m128i_i16[i] = tmpVec[i]
sgn = _mm_sign_epi16(tmp);
of course "_mm_sign_epi16" doesn't exist, so that's what I'm looking for.
How slow it is to do it element by element?
EDIT2: desired behaviour: 1 for positive values, 0 for zero, and -1 for negative values.
thanks
You can use min/max operations to get the desired result, e.g.
inline __m128i _mm_sgn_epi16(__m128i v)
{
v = _mm_min_epi16(v, _mm_set1_epi16(1));
v = _mm_max_epi16(v, _mm_set1_epi16(-1));
return v;
}
This is probably a little more efficient than explicitly comparing with zero + shifting + combining results.
Note that there is already an _mm_sign_epi16
intrinsic in SSSE3 (PSIGNW
- see tmmintrin.h
), which behaves somewhat differently, so I changed the name for the required function to _mm_sgn_epi16
. Using _mm_sign_epi16
might be more efficient when SSSE3 is available however, so you could do something like this:
inline __m128i _mm_sgn_epi16(__m128i v)
{
#ifdef __SSSE3__
v = _mm_sign_epi16(_mm_set1_epi16(1), v); // use PSIGNW on SSSE3 and later
#else
v = _mm_min_epi16(v, _mm_set1_epi16(1)); // use PMINSW/PMAXSW on SSE2/SSE3.
v = _mm_max_epi16(v, _mm_set1_epi16(-1));
#endif
return v;
}
Fill a register of zeros, and compare it with your register, first with "greater than", than with "lower than" (or invert the order of the operands in the "greater than" instruction).
http://msdn.microsoft.com/en-us/library/xd43yfsa%28v=vs.90%29.aspx
http://msdn.microsoft.com/en-us/library/t863edb2%28v=vs.90%29.aspx
The problem at this point is that the true value is represented as 0xffff, which happens to be -1, correct result for the negative number but not for the positive. However, as pointed out by Raymond Chen in the comments, 0x0000 - 0xffff = 0x0001, so it's enough now to subtract the result of "greater than" from the result of "lower than". http://msdn.microsoft.com/en-us/library/y25yya27%28v=vs.90%29.aspx
Of course Paul R answer is preferable, as it uses only 2 instructions.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With