In one of my applications, I need to efficiently de-interleave bits in a long stream of data. Ideally, I would like to use the BMI2 _pext_u32() and/or _pext_u64() x86_64 intrinsics when available. I scoured the internet for documentation on x86intrin.h (GCC), but couldn't find much on the subject, so I am asking the gurus on StackOverflow to help me out.
1. Where can I find documentation on x86intrin.h?
2. Does _pext_*() already have code behind it to fall back on, or do I need to write the fallback code myself (for conditional compile)?
3. Does GCC use _pext_*() automatically when compiling with optimization enabled and with -mbmi2?
Intel publishes the Intrinsics Guide, which also applies to GCC. You will have to write your own fallback code if you use these intrinsics.
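As a rough illustration of what such a fallback might look like with conditional compilation, here is a minimal sketch. The names pext64_soft and pext64 are made up for this example. It assumes the build either passes -mbmi2 for the whole translation unit (in which case GCC defines __BMI2__) or does not, and the software loop is just one straightforward way to emulate the instruction, not a tuned one.

```c
#include <stdint.h>

#if defined(__BMI2__) && defined(__x86_64__)
#include <immintrin.h>   /* declares _pext_u64 when BMI2 is enabled */
#define USE_HW_PEXT 1    /* hypothetical macro name for this sketch */
#endif

/* Portable parallel bit extract: gathers the bits of src selected by
   mask into the low bits of the result, keeping their relative order. */
static uint64_t pext64_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t out_bit = 1; mask != 0; out_bit <<= 1) {
        uint64_t lowest = mask & -mask;   /* isolate lowest set mask bit */
        if (src & lowest)
            result |= out_bit;
        mask &= mask - 1;                 /* clear that mask bit */
    }
    return result;
}

/* Hypothetical wrapper: picks the implementation at compile time. */
static inline uint64_t pext64(uint64_t src, uint64_t mask)
{
#ifdef USE_HW_PEXT
    return _pext_u64(src, mask);
#else
    return pext64_soft(src, mask);
#endif
}
```

For de-interleaving, calling pext64(word, 0x5555555555555555) would collect the even-numbered bits and pext64(word, 0xAAAAAAAAAAAAAAAA) the odd-numbered ones. A binary built this way with -mbmi2 will still fault on CPUs without BMI2, since the choice is made at compile time only.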
You can achieve automatic switching of implementations by using IFUNC resolvers, but for non-library code, using conditionals or function pointers is probably simpler.
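The function-pointer route could look roughly like the sketch below. All function and macro names here are hypothetical. It assumes GCC or Clang on x86-64, uses the __builtin_cpu_supports("bmi2") built-in for the runtime check, and relies on the target("bmi2") function attribute so that only one function is compiled with BMI2 enabled while the rest of the file stays baseline.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>
#define CAN_TRY_BMI2 1   /* hypothetical macro name for this sketch */
#endif

typedef uint64_t (*pext64_fn)(uint64_t src, uint64_t mask);

/* Portable fallback: one loop iteration per set bit in mask. */
static uint64_t pext64_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t out_bit = 1; mask != 0; out_bit <<= 1) {
        if (src & (mask & -mask))   /* test src at lowest set mask bit */
            result |= out_bit;
        mask &= mask - 1;           /* clear that mask bit */
    }
    return result;
}

#ifdef CAN_TRY_BMI2
/* Only this function is compiled with BMI2 enabled; it must be called
   only after the runtime check below has succeeded. */
__attribute__((target("bmi2")))
static uint64_t pext64_bmi2(uint64_t src, uint64_t mask)
{
    return _pext_u64(src, mask);
}
#endif

/* Returns the best implementation for the CPU we are running on. */
static pext64_fn select_pext64(void)
{
#ifdef CAN_TRY_BMI2
    __builtin_cpu_init();
    if (__builtin_cpu_supports("bmi2"))
        return pext64_bmi2;
#endif
    return pext64_soft;
}

/* Splits each 64-bit word into its even-numbered and odd-numbered bits. */
static void deinterleave(const uint64_t *in, size_t n,
                         uint32_t *even, uint32_t *odd)
{
    pext64_fn pext = select_pext64();   /* pick once, outside the loop */
    for (size_t i = 0; i < n; i++) {
        even[i] = (uint32_t)pext(in[i], UINT64_C(0x5555555555555555));
        odd[i]  = (uint32_t)pext(in[i], UINT64_C(0xAAAAAAAAAAAAAAAA));
    }
}
```

Selecting the pointer once per call to deinterleave keeps the check out of the inner loop. The indirect call still prevents inlining, so for a hot loop it may be worth dispatching at the level of the whole loop instead of the single operation.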
Looking at the gcc/config/i386/i386.md and gcc/config/i386/i386.c files, I don't see anything in GCC 8 which would automatically select the pext instruction without intrinsics in the source code.
The design philosophy of Intel's intrinsics is that you use them only in functions that will run only on CPUs with the required extensions. Checking for support on every instruction would add way too much overhead, and then there'd have to be a fallback (there isn't).
Intel intrinsics are not like GNU C __builtin_popcountll, which does use a fallback if compiled without -mpopcnt. But note that you can enable target options on a per-function basis with attributes.
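To make the contrast concrete, here is a small sketch, assuming GCC or Clang on x86-64. The function names are invented for the example. The first function builds with or without -mpopcnt because the compiler supplies a software fallback for the builtin. The second uses the target attribute to enable popcnt for that one function only, so it is up to the caller to check the CPU first.

```c
#include <stdint.h>

/* Builds with or without -mpopcnt: when the option is off, the compiler
   falls back to a software implementation of the builtin. */
static int count_bits(uint64_t x)
{
    return __builtin_popcountll(x);
}

#if defined(__x86_64__) && defined(__GNUC__)
/* popcnt is enabled for this function only, whatever the command-line
   options say. Call it only on CPUs that support the instruction. */
__attribute__((target("popcnt")))
static int count_bits_popcnt(uint64_t x)
{
    return __builtin_popcountll(x);
}
#endif

/* Hypothetical helper: checks the CPU at run time before taking the
   popcnt-enabled path. */
static int count_bits_auto(uint64_t x)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt"))
        return count_bits_popcnt(x);
#endif
    return count_bits(x);
}
```

The same attribute approach works for the pext intrinsics with target("bmi2"), except that there the non-BMI2 path has to be code you wrote yourself.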