Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I set __m128i without using of any SSE instruction?

I have many function which use the same constant __m128i values. For example:

const __m128i K8 = _mm_setr_epi8(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
const __m128i K16 = _mm_setr_epi16(1, 2, 3, 4, 5, 6, 7, 8);
const __m128i K32 = _mm_setr_epi32(1, 2, 3, 4);

So I want to store all these constants in an one place. But there is a problem: I perform checking of existed CPU extension in run time. If the CPU doesn't support for example SSE (or AVX) than will be a program crash during constants initialization.

So is it possible to initialize these constants without using of SSE?

like image 403
Akira Avatar asked Feb 08 '16 11:02

Akira


People also ask

What is __ m128i type?

__m128 Data Types The __m128i data type can hold sixteen 8-bit, eight 16-bit, four 32-bit, or two 64-bit integer values. The compiler aligns __m128d and __m128i local and global data to 16-byte boundaries on the stack. To align integer, float, or double arrays, you can use the __declspec(align) statement.

What is Immintrin?

The immintrin. h header file defines a set of data types that represent different types of vectors. These are; __m256 : This is a vector of eight floating point numbers (8x32 = 256 bits)


2 Answers

Initialization of __m128i vector without using SSE instructions is possible but it depends on how to compiler defines __m128i.

For Microsoft Visual Studio you can define next macros (it defines __m128i as char[16]):

template <class T> inline char GetChar(T value, size_t index)
{
    return ((char*)&value)[index];
}

#define AS_CHAR(a) char(a)

#define AS_2CHARS(a) \
    GetChar(int16_t(a), 0), GetChar(int16_t(a), 1)

#define AS_4CHARS(a) \
    GetChar(int32_t(a), 0), GetChar(int32_t(a), 1), \
    GetChar(int32_t(a), 2), GetChar(int32_t(a), 3)

#define _MM_SETR_EPI8(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, aa, ab, ac, ad, ae, af) \
    {AS_CHAR(a0), AS_CHAR(a1), AS_CHAR(a2), AS_CHAR(a3), \
     AS_CHAR(a4), AS_CHAR(a5), AS_CHAR(a6), AS_CHAR(a7), \
     AS_CHAR(a8), AS_CHAR(a9), AS_CHAR(aa), AS_CHAR(ab), \
     AS_CHAR(ac), AS_CHAR(ad), AS_CHAR(ae), AS_CHAR(af)}

#define _MM_SETR_EPI16(a0, a1, a2, a3, a4, a5, a6, a7) \
    {AS_2CHARS(a0), AS_2CHARS(a1), AS_2CHARS(a2), AS_2CHARS(a3), \
     AS_2CHARS(a4), AS_2CHARS(a5), AS_2CHARS(a6), AS_2CHARS(a7)}

#define _MM_SETR_EPI32(a0, a1, a2, a3) \
    {AS_4CHARS(a0), AS_4CHARS(a1), AS_4CHARS(a2), AS_4CHARS(a3)}       

For GCC it will be (it defines __m128i as long long[2]):

#define CHAR_AS_LONGLONG(a) (((long long)a) & 0xFF)

#define SHORT_AS_LONGLONG(a) (((long long)a) & 0xFFFF)

#define INT_AS_LONGLONG(a) (((long long)a) & 0xFFFFFFFF)

#define LL_SETR_EPI8(a, b, c, d, e, f, g, h) \
    CHAR_AS_LONGLONG(a) | (CHAR_AS_LONGLONG(b) << 8) | \
    (CHAR_AS_LONGLONG(c) << 16) | (CHAR_AS_LONGLONG(d) << 24) | \
    (CHAR_AS_LONGLONG(e) << 32) | (CHAR_AS_LONGLONG(f) << 40) | \
    (CHAR_AS_LONGLONG(g) << 48) | (CHAR_AS_LONGLONG(h) << 56)

#define LL_SETR_EPI16(a, b, c, d) \
    SHORT_AS_LONGLONG(a) | (SHORT_AS_LONGLONG(b) << 16) | \
    (SHORT_AS_LONGLONG(c) << 32) | (SHORT_AS_LONGLONG(d) << 48)

#define LL_SETR_EPI32(a, b) \
    INT_AS_LONGLONG(a) | (INT_AS_LONGLONG(b) << 32)        

#define _MM_SETR_EPI8(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, aa, ab, ac, ad, ae, af) \
    {LL_SETR_EPI8(a0, a1, a2, a3, a4, a5, a6, a7), LL_SETR_EPI8(a8, a9, aa, ab, ac, ad, ae, af)}

#define _MM_SETR_EPI16(a0, a1, a2, a3, a4, a5, a6, a7) \
    {LL_SETR_EPI16(a0, a1, a2, a3), LL_SETR_EPI16(a4, a5, a6, a7)}

#define _MM_SETR_EPI32(a0, a1, a2, a3) \
    {LL_SETR_EPI32(a0, a1), LL_SETR_EPI32(a2, a3)}        

So in your code initialization of __m128i constant will be look like:

const __m128i K8 = _MM_SETR_EPI8(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
const __m128i K16 = _MM_SETR_EPI16(1, 2, 3, 4, 5, 6, 7, 8);
const __m128i K32 = _MM_SETR_EPI32(1, 2, 3, 4);
like image 89
ErmIg Avatar answered Oct 12 '22 04:10

ErmIg


I suggest defining the initialisation data globally as scalar data and then load it locally into a const __m128i:

static const uint8_t gK8[16] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };

static inline foo()
{
    const __m128i K8 = _mm_loadu_si128((__m128i *)gK8);

    // ...
}
like image 27
Paul R Avatar answered Oct 12 '22 05:10

Paul R