In SSE2 there is a function _mm_cvtepi32_ps(__m128i input) which takes an input vector of 32-bit signed integers (int32_t) and converts them into floats.
Now I want to interpret the input integers as unsigned. But there is no function _mm_cvtepu32_ps, and I could not find an implementation of one. Do you know where I can find such a function, or can you at least give a hint on the implementation?
To illustrate the difference in results:
unsigned int a = 2480160505U; // 10010011 11010100 00111110 11111001
float a1 = a;                 // 01001111 00010011 11010100 00111111
float a2 = (signed int)a;     // 11001110 11011000 01010111 10000010
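The bit patterns in the comments above can be checked with a small scalar sketch. The helper names float_bits and as_signed are made up for this illustration; memcpy is used for the reinterpretation so that no implementation-defined cast is involved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the IEEE-754 bit pattern of a float. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);   /* the memcpy below relies on this */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: reinterpret the same 32 bits as a signed integer,
   which is how _mm_cvtepi32_ps reads each lane. */
static int32_t as_signed(uint32_t u)
{
    int32_t s;
    memcpy(&s, &u, sizeof s);
    return s;
}
```

For a = 2480160505U, float_bits((float)a) gives 0x4F13D43F, while float_bits((float)as_signed(a)) gives 0xCED85782, matching a1 and a2.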
I think Paul's answer is nice, but it fails for v = 4294967295U (= 2^32-1). In that case v2 = 2^31-1 and v1 = 2^31. Read as a signed 32-bit integer, v1 is -2^31, so the intrinsic _mm_cvtepi32_ps converts it to -2.14748365E9. v2 = 2^31-1 is not representable as a float and is rounded to 2.14748365E9, and consequently _mm_add_ps returns 0 (after rounding, v1f and v2f are exact opposites of each other).
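The failure is easy to reproduce. The sketch below repeats the halving approach from Paul's answer under the made-up name cvtepu32_ps_halves and wraps it in a hypothetical four-lane helper, convert4_halves, so that the lanes can be inspected from scalar code. It assumes the default round-to-nearest mode.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 */
#include <stddef.h>
#include <stdint.h>

/* The halving approach: v = (v / 2) + (v - v / 2), with each half
   converted by the signed conversion. */
static inline __m128 cvtepu32_ps_halves(__m128i v)
{
    __m128i v2 = _mm_srli_epi32(v, 1);   /* v / 2       */
    __m128i v1 = _mm_sub_epi32(v, v2);   /* v - (v / 2) */
    return _mm_add_ps(_mm_cvtepi32_ps(v2), _mm_cvtepi32_ps(v1));
}

/* Hypothetical helper: convert four unsigned integers at once. */
static void convert4_halves(const uint32_t *in, float *out)
{
    assert(in != NULL && out != NULL);
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_ps(out, cvtepu32_ps_halves(v));
}
```

For the input {10, 0x80000000, 0xFFFFFFFE, 0xFFFFFFFF} the first three lanes come out as 10, 2^31 and 2^32, but the last lane is 0 instead of 2^32.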
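The idea of the solution below is to copy the most significant bit of v to v_high and the other 31 bits to v_low. v_low is non-negative as a signed integer, so _mm_cvtepi32_ps handles it correctly. v_high is converted to 0 or 2.14748365E9 (2^31), and the two floats are added.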
inline __m128 _mm_cvtepu32_v3_ps(const __m128i v)
{
    __m128i msk0     = _mm_set1_epi32(0x7FFFFFFF);
    __m128i zero     = _mm_xor_si128(msk0, msk0);
    __m128i cnst2_31 = _mm_set1_epi32(0x4F000000);   /* IEEE representation of float 2^31 */
    __m128i v_high   = _mm_andnot_si128(msk0, v);    /* bit 31 of v only */
    __m128i v_low    = _mm_and_si128(msk0, v);       /* bits 0..30 of v */
    __m128  v_lowf   = _mm_cvtepi32_ps(v_low);
    __m128i msk1     = _mm_cmpeq_epi32(v_high, zero); /* all ones where bit 31 is clear */
    __m128  v_highf  = _mm_castsi128_ps(_mm_andnot_si128(msk1, cnst2_31)); /* 2^31 or 0 */
    __m128  v_sum    = _mm_add_ps(v_lowf, v_highf);
    return v_sum;
}
It was possible to reduce the number of instructions:
inline __m128 _mm_cvtepu32_v4_ps(const __m128i v)
{
    __m128i msk0     = _mm_set1_epi32(0x7FFFFFFF);
    __m128i cnst2_31 = _mm_set1_epi32(0x4F000000);
    __m128i msk1     = _mm_srai_epi32(v, 31);
    __m128i v_low    = _mm_and_si128(msk0, v);
    __m128  v_lowf   = _mm_cvtepi32_ps(v_low);
    __m128  v_highf  = _mm_castsi128_ps(_mm_and_si128(msk1, cnst2_31));
    __m128  v_sum    = _mm_add_ps(v_lowf, v_highf);
    return v_sum;
}
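The intrinsic _mm_srai_epi32 performs an arithmetic right shift: it shifts v to the right while shifting in copies of the sign bit. Shifting by 31 therefore fills each lane with its most significant bit, giving all ones where bit 31 of v is set and zero elsewhere, which turns out to be quite useful here.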
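To see what the shift produces, the sketch below applies it to four sample lanes and also forms the masked constant that becomes v_highf. The helper name high_part_bits is hypothetical; it returns raw bit patterns rather than floats so that the mask is visible.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 */
#include <stddef.h>
#include <stdint.h>

/* For each lane: mask[i] is bit 31 of in[i] replicated 32 times;
   high[i] is the bit pattern of the float 2^31 if bit 31 was set, else 0. */
static void high_part_bits(const uint32_t *in, uint32_t *mask, uint32_t *high)
{
    assert(in != NULL && mask != NULL && high != NULL);
    __m128i v        = _mm_loadu_si128((const __m128i *)in);
    __m128i cnst2_31 = _mm_set1_epi32(0x4F000000);
    __m128i msk1     = _mm_srai_epi32(v, 31);   /* arithmetic shift */
    _mm_storeu_si128((__m128i *)mask, msk1);
    _mm_storeu_si128((__m128i *)high, _mm_and_si128(msk1, cnst2_31));
}
```

For the input {0, 1, 0x80000000, 0xFFFFFFFF} the mask is {0, 0, 0xFFFFFFFF, 0xFFFFFFFF} and the high part is {0, 0, 0x4F000000, 0x4F000000}.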
This functionality exists in AVX-512 (the vcvtudq2ps instruction, exposed as _mm512_cvtepu32_ps), but if you can't wait until then the only thing I can suggest is to convert the unsigned int input values into pairs of smaller values, convert these, and then add them together again, e.g.
inline __m128 _mm_cvtepu32_ps(const __m128i v)
{
    __m128i v2 = _mm_srli_epi32(v, 1);   // v2 = v / 2
    __m128i v1 = _mm_sub_epi32(v, v2);   // v1 = v - (v / 2)
    __m128 v2f = _mm_cvtepi32_ps(v2);
    __m128 v1f = _mm_cvtepi32_ps(v1);
    return _mm_add_ps(v2f, v1f);
}
UPDATE
As noted by @wim in his answer, the above solution fails for an input value of UINT_MAX. Here is a more robust, but slightly less efficient solution, which should work for the full uint32_t input range:
inline __m128 _mm_cvtepu32_ps(const __m128i v)
{
    __m128i v2 = _mm_srli_epi32(v, 1);                 // v2 = v / 2
    __m128i v1 = _mm_and_si128(v, _mm_set1_epi32(1));  // v1 = v & 1
    __m128 v2f = _mm_cvtepi32_ps(v2);
    __m128 v1f = _mm_cvtepi32_ps(v1);
    return _mm_add_ps(_mm_add_ps(v2f, v2f), v1f);      // return 2 * v2 + v1
}
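As a usage illustration, here is a minimal sketch that applies the same steps to a whole array. The names cvtepu32_ps_sse2 and convert_u32_to_float are hypothetical; a different function name is used because the AVX-512VL intrinsic is also called _mm_cvtepu32_ps, so a user-defined function with that name may clash when <immintrin.h> is included. The sketch assumes the default round-to-nearest mode.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 */
#include <stddef.h>
#include <stdint.h>

/* Same steps as the updated version above, under a hypothetical name. */
static inline __m128 cvtepu32_ps_sse2(__m128i v)
{
    __m128i v2  = _mm_srli_epi32(v, 1);                  /* v / 2 */
    __m128i v1  = _mm_and_si128(v, _mm_set1_epi32(1));   /* v & 1 */
    __m128  v2f = _mm_cvtepi32_ps(v2);
    __m128  v1f = _mm_cvtepi32_ps(v1);
    return _mm_add_ps(_mm_add_ps(v2f, v2f), v1f);        /* 2 * v2 + v1 */
}

/* Convert n unsigned integers, four at a time, with a scalar tail. */
static void convert_u32_to_float(const uint32_t *in, float *out, size_t n)
{
    assert(in != NULL && out != NULL);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(in + i));
        _mm_storeu_ps(out + i, cvtepu32_ps_sse2(v));
    }
    for (; i < n; ++i)
        out[i] = (float)in[i];   /* leftover elements */
}
```

Because the input is rounded in two steps, the result is not guaranteed to match a scalar cast bit for bit on every input: 0x80000081, for instance, comes out as 2^31, where a correctly rounded conversion gives 2^31 + 256. Boundary values such as 0, 0x7FFFFFFF, 0x80000000 and UINT_MAX agree with the scalar cast.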