
What are the differences between the compress and expand instructions in AVX-512?

I was studying the expand and compress operations from the Intel intrinsics guide. I'm confused about these two concepts:

For __m128d _mm_mask_expand_pd (__m128d src, __mmask8 k, __m128d a) == vexpandpd

Load contiguous active double-precision (64-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).

For __m128d _mm_mask_compress_pd (__m128d src, __mmask8 k, __m128d a) == vcompresspd

Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.

Is there a clearer description, or can anyone explain these in more detail?

Hossein Amiri asked Jul 09 '18 09:07


1 Answer

These instructions implement the APL operators \ (expand) and / (compress). Expand takes a bit mask α of m ≥ n bits, of which n are set, and an array ω of n numbers, and returns a vector of m numbers with the numbers from ω inserted at the positions indicated by α and the rest set to zero. For example,

0 1 1 0 1 0 \ 2 3 4

returns

0 2 3 0 4 0

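The _mm512_mask_expand_pd intrinsic (vexpandpd on a 512-bit register) implements this operator for fixed m = 8; the 128-bit _mm_mask_expand_pd from the question is the same operation with m = 2. The mask variants fill the unselected positions from src instead of zero, while the maskz variants (e.g. _mm_maskz_expand_pd) zero them.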
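To make the semantics concrete, here is a minimal scalar sketch of masked expand in plain C. It is only a model for reasoning about the instruction, not Intel's implementation: the name expand_pd_model and the explicit lane count m are made up for illustration, and it assumes lane i corresponds to bit i of the mask (so the leftmost APL element is lane 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of masked expand (not an Intel API).
 * For each lane i < m: if bit i of k is set, dst[i] receives the next
 * unread element of a, read contiguously starting at a[0];
 * otherwise dst[i] is copied from src[i]. */
static void expand_pd_model(double *dst, const double *src, uint8_t k,
                            const double *a, size_t m)
{
    assert(m <= 8);          /* an 8-bit mask covers at most 8 lanes */
    size_t next = 0;         /* index of the next contiguous element of a */
    for (size_t i = 0; i < m; i++) {
        if ((k >> i) & 1u)
            dst[i] = a[next++];   /* active lane: consume one element of a */
        else
            dst[i] = src[i];      /* inactive lane: pass through from src */
    }
}
```

With src filled with zeros, k = 0x16 (lanes 1, 2 and 4 set), a = {2, 3, 4} and m = 6, this reproduces the APL example above. Passing a non-zero src models the mask variant's pass-through behaviour.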

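The compress operation undoes the effect of the expand operation, i.e. it uses a bit mask α to select entries from ω and stores these entries contiguously, either in the low elements of the destination register or, with the compressstoreu intrinsics (e.g. _mm_mask_compressstoreu_pd), to memory.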
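The register form of compress can be sketched the same way in scalar C. Again this is only an illustrative model under assumptions: compress_pd_model and the lane count m are hypothetical names, and lane i is taken to correspond to bit i of the mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of masked compress (not an Intel API).
 * Lanes of a whose mask bit is set are packed into the low lanes of dst;
 * the remaining upper lanes of dst are passed through from the same
 * positions in src. Returns the number of lanes that were packed. */
static size_t compress_pd_model(double *dst, const double *src, uint8_t k,
                                const double *a, size_t m)
{
    assert(m <= 8);          /* an 8-bit mask covers at most 8 lanes */
    size_t next = 0;         /* next free low lane in dst */
    for (size_t i = 0; i < m; i++)
        if ((k >> i) & 1u)
            dst[next++] = a[i];   /* active lane: append to the packed run */
    for (size_t i = next; i < m; i++)
        dst[i] = src[i];          /* untouched upper lanes come from src */
    return next;
}
```

Feeding it a = {0, 2, 3, 0, 4, 0} with k = 0x16 and m = 6 packs 2, 3, 4 into the first three lanes, which is the inverse of the expand example in the sense described above.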

fuz answered Oct 06 '22 00:10