How to find the max member in a __m128(F32vec4)

Question

Something like this:

__declspec(align(16)) float dens[4];

// S_START, Pos and _Vector are F32vec4 values computed earlier.

*((__m128*)dens) = (S_START - Pos) * _Vector;

float steps = max(max(dens[3], dens[2]), max(dens[1], dens[0]));

How do I do this directly using SSE?

Mysticial · Accepted Answer

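There's no single instruction for this. SSE is designed for vertical (lane-by-lane) operations rather than horizontal ones across the lanes of one register, so you have to shuffle the elements and reduce in steps.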

Here's one approach:

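__m128 a = _mm_set_ps(10, 9, 7, 8);

// Swap the upper and lower halves: {a,b,c,d} -> {c,d,a,b}
__m128 b = _mm_shuffle_ps(a, a, _MM_SHUFFLE(1, 0, 3, 2));  //  immediate 78
a = _mm_max_ps(a, b);   //  pairwise maxima of the two halves

// Swap adjacent elements: {a,b,c,d} -> {b,a,d,c}
b = _mm_shuffle_ps(a, a, _MM_SHUFFLE(2, 3, 0, 1));         //  immediate 177
a = _mm_max_ss(a, b);   //  lowest element now holds the overall maximum

float out;
_mm_store_ss(&out, a);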

Note that the final store isn't really meant to be a memory store. It's just a way to get the lowest element into a float variable.

In practice the compiler usually needs no instruction for it, because scalar float values are kept in the same SSE registers. (The top 3 elements are simply ignored.)
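As a rough sketch, the same two shuffle-and-max steps can be wrapped in a small helper. The name hmax_ps is hypothetical (it is not part of SSE or of the answer above), and the sketch swaps the final _mm_store_ss for _mm_cvtss_f32, which returns the lowest element directly as a float:

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_shuffle_ps, _mm_max_ps, _mm_max_ss, _mm_cvtss_f32 */

/* Hypothetical helper: returns the largest of the four floats in v. */
static inline float hmax_ps(__m128 v)
{
    /* Swap the upper and lower halves: {v0,v1,v2,v3} -> {v2,v3,v0,v1}. */
    __m128 swapped = _mm_shuffle_ps(v, v, _MM_SHUFFLE(1, 0, 3, 2));
    /* Element 0 = max(v0,v2), element 1 = max(v1,v3). */
    __m128 m = _mm_max_ps(v, swapped);

    /* Swap adjacent elements so element 0 of 'swapped' holds max(v1,v3). */
    swapped = _mm_shuffle_ps(m, m, _MM_SHUFFLE(2, 3, 0, 1));
    /* Only element 0 is compared; it now holds the maximum of all four. */
    m = _mm_max_ss(m, swapped);

    /* Read element 0 as a scalar float; no explicit store is written. */
    return _mm_cvtss_f32(m);
}
```

With a helper like this, the line from the question could plausibly become float steps = hmax_ps((S_START - Pos) * _Vector); assuming F32vec4 converts implicitly to __m128, and the aligned dens array would no longer be needed.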

Tags: c, simd, sse

user1468756
