How can I create a library that will dynamically switch between SSE, AVX, and AVX2 code paths depending on the host processor/OS? I am using Agner Fog's VCL (Vector Class Library) and compiling with GCC for Linux.
See the section "Instruction sets and CPU dispatching" in the manual for the Vector Class Library. In that section Agner writes:
The file dispatch_example.cpp shows an example of how to make a CPU dispatcher that selects the appropriate code version.
Read the source code to dispatch_example.cpp. At the start of the file you should see the comment
# Compile dispatch_example.cpp five times for different instruction sets:
| g++ -O3 -msse2 -c dispatch_example.cpp -od2.o
| g++ -O3 -msse4.1 -c dispatch_example.cpp -od5.o
| g++ -O3 -mavx -c dispatch_example.cpp -od7.o
| g++ -O3 -mavx2 -c dispatch_example.cpp -od8.o
| g++ -O3 -mavx512f -c dispatch_example.cpp -od9.o
| g++ -O3 -msse2 -otest instrset_detect.cpp d2.o d5.o d7.o d8.o d9.o
| ./test
The last g++ command also compiles the file instrset_detect.cpp. You should read the source code to this as well; it is what calls CPUID.
Here is a summary of some, but not all of, my questions and answers on CPU dispatchers.
The assembly instruction cpuid can give you this information at runtime. Someone has helpfully created a library based on this to do just what you need.
You could create a function dispatch table, and populate it with the correct code path functions based on the results of querying using this code.
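As a rough illustration of such a dispatch table, here is a minimal sketch. It assumes GCC or Clang on x86 and uses the compiler builtins `__builtin_cpu_init` and `__builtin_cpu_supports` in place of the library mentioned above. The `kernels::sum_*` functions and `select_sum` are hypothetical names with placeholder bodies; in a real build each kernel would live in its own object file compiled for its instruction set.

```cpp
#include <cassert>
#include <cstddef>

namespace kernels {
// Hypothetical stand-ins. The bodies are identical here only so the
// sketch compiles as a single file; real versions would differ.
float sum_sse2(const float* p, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}
float sum_avx(const float* p, std::size_t n)  { return sum_sse2(p, n); }
float sum_avx2(const float* p, std::size_t n) { return sum_sse2(p, n); }
}  // namespace kernels

using sum_fn = float (*)(const float*, std::size_t);

// Choose the most capable kernel the running CPU reports support for.
sum_fn select_sum() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return kernels::sum_avx2;
    if (__builtin_cpu_supports("avx"))  return kernels::sum_avx;
#endif
    return kernels::sum_sse2;  // baseline fallback
}

// Public entry point: the lookup runs once, on the first call.
float sum(const float* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    static const sum_fn impl = select_sum();
    return impl(p, n);
}
```

Callers only ever see `sum`, so the choice of code path stays an internal detail of the library. Whether the builtin also confirms that the OS saves the wider registers is worth checking in the GCC documentation for your compiler version.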
UPDATE: (answer to question in comments)
To create the different code paths in the first place, you need to compile the different code paths separately, and then link them together. For each one, you specify the architecture needed by using various values of the -march switch in your compile line.
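One way to keep the separately compiled versions from colliding at link time is to put each build into its own namespace. The sketch below is an assumption-laden illustration, not the VCL's own mechanism: `KERNEL_NS`, the `ns_*` namespaces, and `scale` are hypothetical names, while `__AVX2__` and `__AVX__` are macros GCC predefines according to the -m or -march flags.

```cpp
#include <cstddef>

// Compile this same file once per target, for example:
//   g++ -O3 -msse2 -c kernel.cpp -o k_base.o
//   g++ -O3 -mavx  -c kernel.cpp -o k_avx.o
//   g++ -O3 -mavx2 -c kernel.cpp -o k_avx2.o
// (kernel.cpp is a hypothetical file name.)
#if defined(__AVX2__)
#define KERNEL_NS ns_avx2
#elif defined(__AVX__)
#define KERNEL_NS ns_avx
#else
#define KERNEL_NS ns_base
#endif

namespace KERNEL_NS {
// Multiply every element by k. A plain loop stands in for real vector
// code; the compiler flags decide which instructions it may use.
void scale(float* p, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i) p[i] *= k;
}
}  // namespace KERNEL_NS
```

After linking the object files together, the dispatcher can refer to `ns_base::scale`, `ns_avx::scale`, and `ns_avx2::scale` as distinct functions.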