It appears that, starting with the most recent update (15.5), Visual Studio 2017 generates code that uses the AVX extension (for x64 builds, that is) even though "Enable Enhanced Instruction Set" is set to "Not Set", which, according to the tooltip, should only allow SSE2 instructions. Setting it to either /arch:SSE2 or /arch:IA32 leads to the compiler warning "ignoring unknown option '/arch:IA32'" (or '/arch:SSE2', accordingly), which according to https://connect.microsoft.com/VisualStudio/feedback/details/1217151 is expected behavior. So is there any way now to make the compiler not generate AVX-specific code?
This has been fixed in 15.7
Yes, we will add an option to enable AVX2 in the drop-down menu at: Project Property Pages | Configuration Properties | C/C++ | Code Generation | Enable Enhanced Instruction Set.
The AVX instruction encoding allows vectors of either 128 bits or 256 bits, and zero-extends all vector results to the full vector size. (For legacy compatibility, SSE-style vector instructions preserve all bits beyond bit 127.) Most floating-point operations are extended to 256 bits.
Under normal conditions, the loop auto-vectorizer, which is enabled by default, can also use an extended instruction set (for example AVX, even when arch is explicitly set to SSE2).
But how should that work if the CPU doesn't support AVX? The compiler inserts a special runtime ISA check (via __isa_available?) for enhanced instruction set support and chooses the code path with supported instructions on demand. It looks like this was done similarly to the emission of SSE4.2 instructions for modern CPUs even when arch is SSE2.
In the last update (15.5), auto-vectorization was broken, at least in x86 / x64 builds. The compiler doesn't insert the runtime ISA check and emits an AVX instruction during loop vectorization (in my case it was vpermilps).
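To make the idea of a runtime ISA check concrete, here is a minimal hand-written sketch of the same pattern: detect AVX once, then pick a code path. It is not what the compiler actually emits. The names cpu_supports_avx, sum_baseline, sum_wide and sum_dispatch are hypothetical, and both code paths are plain C++ stand-ins so that the example builds without any /arch or -mavx flag.

```cpp
#include <cstddef>
#include <vector>

#if defined(_MSC_VER)
#include <intrin.h>     // __cpuid
#include <immintrin.h>  // _xgetbv
#endif

// Hypothetical helper: true when the CPU reports AVX and the OS saves YMM state.
bool cpu_supports_avx() {
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 1);                                  // CPUID leaf 1
    const bool osxsave = (regs[2] & (1 << 27)) != 0;   // ECX bit 27: OSXSAVE
    const bool avx     = (regs[2] & (1 << 28)) != 0;   // ECX bit 28: AVX
    if (!osxsave || !avx) return false;
    return (_xgetbv(0) & 0x6) == 0x6;                  // XCR0 bits 1 and 2: XMM and YMM state
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx") != 0;         // GCC / Clang builtin
#else
    return false;                                      // unknown target: stay on the baseline path
#endif
}

// Baseline path: stands in for the SSE2-only version of a loop.
float sum_baseline(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Stand-in for the "wide" path: eight partial sums mimic a 256-bit register.
// A real AVX path would use intrinsics or compiler-generated AVX code.
float sum_wide(const float* data, std::size_t n) {
    float lanes[8] = {0.0f};
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)
        for (std::size_t k = 0; k < 8; ++k) lanes[k] += data[i + k];
    float total = 0.0f;
    for (float lane : lanes) total += lane;
    for (; i < n; ++i) total += data[i];               // leftover elements
    return total;
}

using SumFn = float (*)(const float*, std::size_t);

// The check runs once; later calls reuse the selected function pointer.
float sum_dispatch(const std::vector<float>& values) {
    static const SumFn fn = cpu_supports_avx() ? sum_wide : sum_baseline;
    return fn(values.data(), values.size());
}
```

The bug described above amounts to the wide path being taken without the check, which faults with an illegal instruction on CPUs that lack AVX.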
Temporary solutions:
As I've suggested in a workaround, you can disable auto-vectorization for a selected loop with:
#pragma loop(no_vector)
for / while / do while ...
Unfortunately, it's a quick hack: potentially every loop can be vectorized, so it is impractical to insert this pragma everywhere. You may also see a performance drop, of course.
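For reference, a minimal sketch of the pragma applied to a concrete loop. The function scale_in_place is a hypothetical example, and the _MSC_VER guard is my addition so that other compilers don't warn about an unknown pragma.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example function: multiplies every element by factor, in place.
void scale_in_place(std::vector<float>& values, float factor) {
#if defined(_MSC_VER)
    // MSVC only: ask the auto-vectorizer to skip the loop that follows.
    #pragma loop(no_vector)
#endif
    for (std::size_t i = 0; i < values.size(); ++i) {
        values[i] *= factor;
    }
}
```

The pragma affects only the loop immediately after it, so every hot loop needs its own copy. Building with /Qvec-report:2 should show which loops the vectorizer still touches.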
Another temporary solution is to try the internal compiler switch /d2Qvec-sse2only, which makes auto-vectorization use only SSE2 (at least, it should work with Visual Studio 2013). This switch is undocumented and can change without notice.
Update: As mentioned by Cheney Wang, the bug has been sent to the C++ team, so you can track its status in the community item.