Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

indexing into an array with SSE

Tags:

c

simd

sse

Suppose I have an array:

uint8_t arr[256];

and an element

__m128i x

containing 16 bytes,

x_1, x_2, ... x_16

I would like to efficiently fill a new __m128i element

__m128i y

with values from arr depending on the values in x, such that:

y_1  = arr[x_1]
y_2  = arr[x_2]
   .
   .
   .
y_16 = arr[x_16]

A command to achieve this would essentially be loading a register from a non-contiguous set of memory locations. I have a painfully vague memory of having seen documentation of such a command, but can't find it now. Does it exist? Thanks in advance for your help.

like image 277
Travis Avatar asked Dec 19 '10 16:12

Travis


People also ask

Can you index an array?

Array indexing is the same as accessing an array element. You can access an array element by referring to its index number. The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.

What does it mean to index into an array?

Definition. An array is an indexed collection of data elements of the same type. 1) Indexed means that the array elements are numbered (starting at 0). 2) The restriction of the same type is an important one, because arrays are stored in consecutive memory cells.

How does indexing an array work?

Indexing is an operation that pulls out a select set of values from an array. The index of a value in an array is that value's location within the array. There is a difference between the value and where the value is stored in an array.

What is array index with example?

An array is an ordered list of values that you refer to with a name and an index. For example, consider an array called emp , which contains employees' names indexed by their numerical employee number. So emp[0] would be employee number zero, emp[1] employee number one, and so on.


1 Answers

This kind of capability in SIMD architectures is known as load/store scatter/gather. Unfortunately SSE does not have it. Future SIMD architectures from Intel may have this - the ill-fated Larrabee processor was one case in point. For now though you will just need to design your data structures in such a way that this kind of functionality is not needed.

Note that you can achieve the equivalent effect by using e.g. _mm_set_epi8:

y = _mm_set_epi8(arr[x_16], arr[x_15], arr[x_14], ..., arr[x_1]);

although of course this will just generate a bunch of scalar code to load your y vector. This is fine if you are doing this kind of operation outside any performance-critical loops, e.g. as part of initialisation prior to looping, but inside a loop it is likely to be a performance-killer.

like image 101
Paul R Avatar answered Oct 22 '22 02:10

Paul R