I am trying to use SSE2 to unpack text with zeros, and extend that to AVX2. Here's what I mean:
Suppose you have some text like this: abcd
I'm trying to use SSE2 to unpack abcd
into a\0b\0c\0d
. The \0
's are zeros. This of course being applied to 16 characters instead of 4.
I was able to do that using this code (ignore the C-Style casts):
__m128i chunk = _mm_loadu_si128((__m128i const*) src); // Load 16 bytes from memory
__m128i half = _mm_unpacklo_epi8(chunk, _mm_setzero_si128()); // Unpack lower 8 bytes with zeros
_mm_storeu_si128((__m128i*) dst, half); // Write to destination
half = _mm_unpackhi_epi8(chunk, _mm_setzero_si128()); // Unpack higher 8 bytes with zeros
_mm_storeu_si128((__m128i*) (dst + 16), half); // Write to destination
This works great, but I'm trying to convert the code into AVX2, so I can process 32 bytes at a time. However, I'm having trouble with unpacking the low bytes.
Here is the code I'm using for AVX2:
__m256i chunk = _mm256_loadu_si256((__m256i const*) src); // Load 32 bytes from memory
__m256i half = _mm256_unpacklo_epi8(chunk, _mm256_setzero_si256()); // Unpack lower 16 bytes with zeros
_mm256_storeu_si256((__m256i*) dst, half); // Write to destination
half = _mm256_unpackhi_epi8(chunk, _mm256_setzero_si256()); // Unpack higher 16 bytes with zeros
_mm256_storeu_si256((__m256i*) (dst + 32), half); // Write to destination
The problem is, the _mm256_unpacklo_epi8
instruction seems to be skipping 8 bytes for every 8 bytes it converts. For example this text (the "fr" at the end is intended):
Permission is hereby granted, fr
Gets converted into
Permissireby graon is hented, fr
Every 8 bytes _mm256_unpacklo_epi8
, processes, 8 bytes get skipped.
What am I doing wrong here? Any help would be greatly appreciated.
As I can see the right answer already has been received from @PeterCordes. Nevertheless I want to supplement it with small helper function:
template <int part> inline __m256i Cvt8uTo16u(__m256i a)
{
return _mm256_cvtepu8_epi16(_mm256_extractf128_si256(a, part));
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With