I've got an application that requires AVX2 to work correctly. A check was implemented to check during application start if CPU has AVX2 instruction. I would like to check if it works correctly, but i only have CPU that has AVX2. Is there a way to temporarly turn it off for testing purposes? Or to somehow emulate other CPU?
Yes, use an "emulation" (or dynamic recompilation) layer like Intel's Software Development Emulator (SDE), or maybe QEMU.
SDE is closed-source freeware, and very handy for both testing AVX512 code on old CPUs, or for simulating old CPUs to check that you don't accidentally execute instructions that are too new.
Example: I happened to have a binary that unconditionally uses an AVX2 vpmovzxwq
load instruction (for a function I was testing). It runs fine on my Skylake CPU natively, but SDE has a -snb
option to emulate a Sandybridge in both CPUID and actually checking every instruction.
$ sde64 -snb -- ./mask
TID 0 SDE-ERROR: Executed instruction not valid for specified chip (SANDYBRIDGE): 0x401005: vpmovzxwq ymm2, qword ptr [rip+0xff2]
Image: /tmp/mask+0x5 (in multi-region image, region# 1)
Instruction bytes are: c4 e2 7d 34 15 f2 0f 00 00
There are options to emulate CPUs as old as -quark
, -p4
(SSE2), or Core 2 Merom (-mrm
), to as new as IceLake-Server (-icx
) or Tremont (-tnt
). (And Xeon Phi CPUs like KNL and KNM.)
It runs pretty quickly, using dynamic recompilation (JIT) so code using only instructions that are supported natively can run at basically native speed, I think.
It also has instrumentation options (like -mix
to dump the instruction mix), and options to control the JIT more closely. I think you could maybe get it to not report AVX2 in CPUID, but still let AVX2 instructions run without faulting.
Or probably emulate a CPU that supports AVX2 but not FMA (there is a real CPU like this from Via, unfortunately). Or combinations that no real CPU has, like AVX2 but not popcnt
, or BMI1/BMI2 but not AVX. But I haven't looked into how to do that.
The basic sde -help
options only let you set it to specific Intel CPUs, and for checking for potentially-slow SSE/AVX transitions (without correct vzeroupper usage). And a few other things.
One important test-case that SDE is missing is AVX+FMA without AVX2 (AMD Piledriver / Steamroller, i.e. most AMD FX-series CPUs). It's easy to forget and use an AVX2 shuffle in code that's supposed to be AVX1+FMA3, and some compilers (like MSVC) won't catch this at compile time the way gcc -march=bdver2
would. (Bulldozer only has AVX + FMA4, not FMA3, because Intel changed their plans after it was too late for AMD to redesign.)
If you just want CPUID not report the presence of AVX2 (and FMA?) so your code uses its AVX1 or non-AVX versions of functions, you can do that with most VMs.
For AVX instructions to run without faulting, a bit in a control register has to be set. (So this works like a promise by the OS that it will correctly save/restore the new architectural state of YMM upper halves). So disabling AVX in CPUID will give you a VM instance where AVX instructions fault. (At least 256-bit instructions? I haven't tried this to see if 128-bit AVX instructions can still execute in this state on HW that supports AVX.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With