
How would you write code for unsigned addition likely to be optimized into one SSE instruction?

Tags: c++, c, sse

In C or C++, how would you write code for unsigned addition of two arrays that is likely to be optimized, by GCC for example, into a single 128-bit SSE unsigned addition instruction?

PatrickBateman asked May 24 '11 18:05



1 Answer

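#include <emmintrin.h>  // SSE2: __m128i, _mm_add_epi32; GCC also defines __v4si here
#include <stdint.h>     // uint32_t

// N     number of 32-bit ints to be added (a multiple of 4)
// a, b  input arrays
// c     sum array
// nReg  number of required vector registers

const unsigned nReg = N*sizeof(uint32_t)/sizeof(__v4si);
__v4si a[nReg], b[nReg], c[nReg];
for (unsigned i=0; i<nReg; ++i)
    // _mm_add_epi32 takes and returns __m128i, so cast explicitly
    c[i] = (__v4si)_mm_add_epi32((__m128i)a[i], (__m128i)b[i]);

// in C++ simply (this relies on GCC's vector extensions)
for (unsigned i=0; i<nReg; ++i)
    c[i] = a[i] + b[i];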
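
The operator form in the last loop relies on GCC's vector extensions, and __v4si has signed lanes. Since the question asks about unsigned addition, here is a minimal sketch with an unsigned lane type. The names u32x4 and add_u32x4 are made up for illustration, and the typedef assumes GCC or a compatible compiler such as Clang.

```c
#include <stdint.h>

/* Hypothetical name: four unsigned 32-bit lanes in one 16-byte vector,
   declared with the GCC/Clang vector_size attribute. */
typedef uint32_t u32x4 __attribute__((vector_size(16)));

/* Adds n vectors lane by lane; each unsigned lane wraps modulo 2^32. */
static void add_u32x4(const u32x4 *a, const u32x4 *b, u32x4 *c, unsigned n)
{
    for (unsigned i = 0; i < n; ++i)
        c[i] = a[i] + b[i];   /* one vector add per iteration */
}
```

On an SSE2 target the compiler would normally emit a single packed 32-bit add for each iteration, but it is worth checking the generated assembly (for example with -S) instead of assuming so.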

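Unroll the loop and prefetch elements as needed, then profile to confirm that it actually helps. For 8-, 16-, or 64-bit ints, substitute __v16qi, __v8hi, or __v2di for __v4si, and use the matching intrinsic (_mm_add_epi8, _mm_add_epi16, or _mm_add_epi64). The same instructions serve signed and unsigned operands, because two's complement addition wraps identically for both.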
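
The loops above assume the data already sits in vector-typed arrays and that N is a multiple of four. As a rough sketch for plain uint32_t arrays of any length, the version below uses unaligned loads and stores plus a scalar tail. The helper name add_u32 is hypothetical, and the code assumes an x86 target with SSE2.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: c[i] = a[i] + b[i] for n unsigned 32-bit ints. */
static void add_u32(const uint32_t *a, const uint32_t *b, uint32_t *c, size_t n)
{
    size_t i = 0;

    /* Four lanes per iteration; the unaligned forms accept any pointer. */
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(c + i), _mm_add_epi32(va, vb));
    }

    /* Scalar tail for the remaining 0 to 3 elements. */
    for (; i < n; ++i)
        c[i] = a[i] + b[i];
}
```

If the arrays are known to be 16-byte aligned, _mm_load_si128 and _mm_store_si128 could replace the unaligned forms. Whether that makes a measurable difference depends on the CPU, so profiling is the safer guide.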

Gunther Piez answered Oct 19 '22 12:10
