indexing into an array with SSE

Tags: c, simd, sse

Suppose I have an array:

uint8_t arr[256];

and an element

__m128i x

containing 16 bytes,

x_1, x_2, ... x_16

I would like to efficiently fill a new __m128i element

__m128i y

with values from arr depending on the values in x, such that:

y_1  = arr[x_1]
y_2  = arr[x_2]
   .
   .
   .
y_16 = arr[x_16]

An instruction to achieve this would essentially load a register from a non-contiguous set of memory locations. I have a painfully vague memory of having seen documentation for such an instruction, but I can't find it now. Does it exist? Thanks in advance for your help.

277

asked Dec 19 '10 16:12

Travis

1 Answer

This kind of capability in SIMD architectures is known as gather (for loads) and scatter (for stores). Unfortunately SSE does not have it. Future SIMD architectures from Intel may have this; the ill-fated Larrabee processor was one case in point. For now, though, you will need to design your data structures so that this kind of functionality is not needed.

Note that you can achieve the equivalent effect by using e.g. _mm_set_epi8:

y = _mm_set_epi8(arr[x_16], arr[x_15], arr[x_14], ..., arr[x_1]);

although of course this will just generate a bunch of scalar code to load your y vector. This is fine if you are doing this kind of operation outside any performance-critical loops, e.g. as part of initialisation prior to looping, but inside a loop it is likely to be a performance-killer.
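For illustration, here is a minimal sketch of the same scalar fallback written out explicitly. It assumes only SSE2; the helper name lookup16 is hypothetical and does not come from any library. It spills the indices to memory, does 16 ordinary table lookups, and reloads the results, which is roughly the kind of work the _mm_set_epi8 form has to do as well.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper (name chosen for this sketch): emulate a byte
   gather with scalar code. Byte i of the result is arr[byte i of x]. */
static __m128i lookup16(const uint8_t arr[256], __m128i x)
{
    uint8_t idx[16];
    uint8_t out[16];

    /* Spill the 16 index bytes from the vector register to memory. */
    _mm_storeu_si128((__m128i *)idx, x);

    /* 16 scalar table lookups; this is the expensive part. */
    for (int i = 0; i < 16; ++i)
        out[i] = arr[idx[i]];

    /* Reload the 16 looked-up bytes into a vector register. */
    return _mm_loadu_si128((const __m128i *)out);
}
```

You would call it as y = lookup16(arr, x). Since each index is a uint8_t and arr has 256 entries, every lookup is in bounds. The performance caveat above still applies: this is fine for setup code but costly inside a hot loop.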

101

answered Oct 22 '22 02:10

Paul R
