I have many function which use the same constant __m128i values. For example:
const __m128i K8 = _mm_setr_epi8(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
const __m128i K16 = _mm_setr_epi16(1, 2, 3, 4, 5, 6, 7, 8);
const __m128i K32 = _mm_setr_epi32(1, 2, 3, 4);
So I want to store all these constants in an one place. But there is a problem: I perform checking of existed CPU extension in run time. If the CPU doesn't support for example SSE (or AVX) than will be a program crash during constants initialization.
So is it possible to initialize these constants without using of SSE?
__m128 Data Types The __m128i data type can hold sixteen 8-bit, eight 16-bit, four 32-bit, or two 64-bit integer values. The compiler aligns __m128d and __m128i local and global data to 16-byte boundaries on the stack. To align integer, float, or double arrays, you can use the __declspec(align) statement.
The immintrin. h header file defines a set of data types that represent different types of vectors. These are; __m256 : This is a vector of eight floating point numbers (8x32 = 256 bits)
Initialization of __m128i vector without using SSE instructions is possible but it depends on how to compiler defines __m128i.
For Microsoft Visual Studio you can define next macros (it defines __m128i as char[16]):
template <class T> inline char GetChar(T value, size_t index)
{
return ((char*)&value)[index];
}
#define AS_CHAR(a) char(a)
#define AS_2CHARS(a) \
GetChar(int16_t(a), 0), GetChar(int16_t(a), 1)
#define AS_4CHARS(a) \
GetChar(int32_t(a), 0), GetChar(int32_t(a), 1), \
GetChar(int32_t(a), 2), GetChar(int32_t(a), 3)
#define _MM_SETR_EPI8(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, aa, ab, ac, ad, ae, af) \
{AS_CHAR(a0), AS_CHAR(a1), AS_CHAR(a2), AS_CHAR(a3), \
AS_CHAR(a4), AS_CHAR(a5), AS_CHAR(a6), AS_CHAR(a7), \
AS_CHAR(a8), AS_CHAR(a9), AS_CHAR(aa), AS_CHAR(ab), \
AS_CHAR(ac), AS_CHAR(ad), AS_CHAR(ae), AS_CHAR(af)}
#define _MM_SETR_EPI16(a0, a1, a2, a3, a4, a5, a6, a7) \
{AS_2CHARS(a0), AS_2CHARS(a1), AS_2CHARS(a2), AS_2CHARS(a3), \
AS_2CHARS(a4), AS_2CHARS(a5), AS_2CHARS(a6), AS_2CHARS(a7)}
#define _MM_SETR_EPI32(a0, a1, a2, a3) \
{AS_4CHARS(a0), AS_4CHARS(a1), AS_4CHARS(a2), AS_4CHARS(a3)}
For GCC it will be (it defines __m128i as long long[2]):
#define CHAR_AS_LONGLONG(a) (((long long)a) & 0xFF)
#define SHORT_AS_LONGLONG(a) (((long long)a) & 0xFFFF)
#define INT_AS_LONGLONG(a) (((long long)a) & 0xFFFFFFFF)
#define LL_SETR_EPI8(a, b, c, d, e, f, g, h) \
CHAR_AS_LONGLONG(a) | (CHAR_AS_LONGLONG(b) << 8) | \
(CHAR_AS_LONGLONG(c) << 16) | (CHAR_AS_LONGLONG(d) << 24) | \
(CHAR_AS_LONGLONG(e) << 32) | (CHAR_AS_LONGLONG(f) << 40) | \
(CHAR_AS_LONGLONG(g) << 48) | (CHAR_AS_LONGLONG(h) << 56)
#define LL_SETR_EPI16(a, b, c, d) \
SHORT_AS_LONGLONG(a) | (SHORT_AS_LONGLONG(b) << 16) | \
(SHORT_AS_LONGLONG(c) << 32) | (SHORT_AS_LONGLONG(d) << 48)
#define LL_SETR_EPI32(a, b) \
INT_AS_LONGLONG(a) | (INT_AS_LONGLONG(b) << 32)
#define _MM_SETR_EPI8(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, aa, ab, ac, ad, ae, af) \
{LL_SETR_EPI8(a0, a1, a2, a3, a4, a5, a6, a7), LL_SETR_EPI8(a8, a9, aa, ab, ac, ad, ae, af)}
#define _MM_SETR_EPI16(a0, a1, a2, a3, a4, a5, a6, a7) \
{LL_SETR_EPI16(a0, a1, a2, a3), LL_SETR_EPI16(a4, a5, a6, a7)}
#define _MM_SETR_EPI32(a0, a1, a2, a3) \
{LL_SETR_EPI32(a0, a1), LL_SETR_EPI32(a2, a3)}
So in your code initialization of __m128i constant will be look like:
const __m128i K8 = _MM_SETR_EPI8(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
const __m128i K16 = _MM_SETR_EPI16(1, 2, 3, 4, 5, 6, 7, 8);
const __m128i K32 = _MM_SETR_EPI32(1, 2, 3, 4);
I suggest defining the initialisation data globally as scalar data and then load it locally into a const __m128i
:
static const uint8_t gK8[16] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
static inline foo()
{
const __m128i K8 = _mm_loadu_si128((__m128i *)gK8);
// ...
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With