It seems there is no intrinsic for bitwise NOT/complement in AVX2. Did I miss it, or are we supposed to do something like _mm256_xor_si256(a, _mm256_set1_epi64x(-1LL))? If the latter, is it optimal? Is there no vector NOT instruction in assembly either?
Yes, XOR with all-ones is the way to do it: the only SIMD bitwise NOT is PXOR/XORPS with an all-ones vector, in MMX, SSE*, and AVX1/2. There is no dedicated vector NOT instruction.
AVX512F can avoid the need for a separate vector constant using vpternlogd same,same,same, with the immediate 0x55. (See my answer on the duplicate for more details about it vs. vpxord: Is NOT missing from SSE, AVX?)
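As a rough sketch of the AVX-512 approach, the snippet below inverts 16 dwords with _mm512_ternarylogic_epi32 and the immediate 0x55. The function name not_u32x16 and the array-based interface are assumptions made for illustration, and the target attribute assumes GCC or Clang (compiling with -mavx512f instead would also work).

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper (name is ours): dst[i] = ~src[i] for 16 uint32_t.
 * Assumes GCC/Clang for the target attribute and a CPU with AVX-512F. */
__attribute__((target("avx512f")))
void not_u32x16(const uint32_t *src, uint32_t *dst)
{
    __m512i v = _mm512_loadu_si512(src);

    /* The immediate is a truth table indexed by (a << 2) | (b << 1) | c.
     * 0x55 has bits 0, 2, 4, 6 set, i.e. every index where c == 0,
     * so the result is ~c. With all three inputs equal, that is ~v,
     * and no all-ones constant is needed. */
    __m512i r = _mm512_ternarylogic_epi32(v, v, v, 0x55);

    _mm512_storeu_si512(dst, r);
}
```

This must only be called on hardware that supports AVX-512F; a runtime check such as __builtin_cpu_supports("avx512f") (GCC/Clang) is one way to guard the call.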
Ideally you can arrange your algorithm to avoid actually needing to NOT something. For example, using PANDN instead of PAND. Or invert later as part of something else. But if you do end up needing to invert, that's how.
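To illustrate folding the inversion into another operation, here is a minimal sketch using the SSE2 intrinsic for PANDN, _mm_andnot_si128. The helper name clear_masked_u32x4 and the array-based interface are made up for this example; the 256-bit equivalent is _mm256_andnot_si256.

```c
#include <emmintrin.h>  /* SSE2 */
#include <stdint.h>

/* Hypothetical helper (name is ours): dst[i] = value[i] & ~mask[i]
 * for 4 uint32_t, without a separate NOT step. */
void clear_masked_u32x4(const uint32_t *value, const uint32_t *mask,
                        uint32_t *dst)
{
    __m128i v = _mm_loadu_si128((const __m128i *)value);
    __m128i m = _mm_loadu_si128((const __m128i *)mask);

    /* _mm_andnot_si128(a, b) computes (~a) & b:
     * it is the FIRST operand that gets inverted. */
    _mm_storeu_si128((__m128i *)dst, _mm_andnot_si128(m, v));
}
```

The operand order is the usual pitfall here: the mask to be inverted goes first.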
The all-ones constant can be generated with vpcmpeqd same,same,same. With intrinsics, let the compiler do this for you by writing _mm256_set1_epi32(-1). (Element size is obviously irrelevant for set1(-1); use whatever makes semantic sense for your algorithm.)
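Putting that together, a NOT helper for AVX2 might look like the sketch below. The names not_si256 and not_u32x8 are hypothetical, and the target attribute assumes GCC or Clang (building with -mavx2 instead would make it unnecessary).

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper (name is ours): bitwise NOT of a 256-bit vector,
 * written as XOR with all-ones. The compiler is free to materialize
 * the constant however it likes, e.g. with vpcmpeqd. */
__attribute__((target("avx2")))
static inline __m256i not_si256(__m256i v)
{
    return _mm256_xor_si256(v, _mm256_set1_epi32(-1));
}

/* Array wrapper for illustration: dst[i] = ~src[i] for 8 uint32_t.
 * Requires a CPU with AVX2. */
__attribute__((target("avx2")))
void not_u32x8(const uint32_t *src, uint32_t *dst)
{
    __m256i v = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)dst, not_si256(v));
}
```

In a loop, the all-ones vector would normally be hoisted out by the compiler, so the per-iteration cost is a single vpxor.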