In Chandler Carruth's CppCon 2015 talk he introduces two magical functions for defeating the optimizer without any extra performance penalties.
For reference, here are the functions (using GNU-style inline assembly):
void escape(void* p)
{
    asm volatile("" : : "g"(p) : "memory");
}
void clobber()
{
    asm volatile("" : : : "memory");
}
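For context, here is a minimal sketch of how the two functions are typically used inside a benchmark body, assuming a compiler that accepts GNU-style inline assembly. `fill_vector` is a hypothetical function invented for illustration. `escape` tells the compiler that the pointed-to memory may be observed elsewhere, and `clobber` tells it that any memory may have been read or written, so pending stores have to be performed.

```cpp
#include <cstddef>
#include <vector>

// GNU-style inline assembly: assumes GCC, Clang, or a compatible compiler.
static void escape(void* p) { asm volatile("" : : "g"(p) : "memory"); }
static void clobber() { asm volatile("" : : : "memory"); }

// Hypothetical benchmark body: the pushes must not be optimized away,
// even though nothing reads the vector's contents afterwards.
std::size_t fill_vector(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    escape(v.data());      // the buffer is now "visible" to unknown code
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        clobber();         // force the store to memory to be performed
    }
    return v.size();
}
```

Both asm statements contain no instructions, so they add no work of their own; they only restrict what the optimizer may assume.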
They work on any compiler that supports GNU-style inline assembly (GCC, Clang, Intel's compiler, possibly others). However, he mentions they don't work in MSVC.
Examining Google Benchmark's implementation, it seems that on compilers other than GCC and Clang they reinterpret_cast the value to a char const volatile&
and pass its address to a function hidden in a different translation unit.
template <class Tp>
inline BENCHMARK_ALWAYS_INLINE void DoNotOptimize(Tp const& value) {
    internal::UseCharPointer(&reinterpret_cast<char const volatile&>(value));
}
// some other translation unit
void UseCharPointer(char const volatile*) {}
However, there are two concerns I have with this:
Is there any lower-level equivalent in MSVC to the GNU-style assembly functions? Or is this the best it gets on MSVC?
While I don't know of an equivalent assembly trick for MSVC, Facebook uses the following in their Folly benchmark library:
/**
 * Call doNotOptimizeAway(var) against variables that you use for
 * benchmarking but otherwise are useless. The compiler tends to do a
 * good job at eliminating unused variables, and this function fools
 * it into thinking var is in fact needed.
 */
#ifdef _MSC_VER
#pragma optimize("", off)
template <class T>
void doNotOptimizeAway(T&& datum) {
    datum = datum;
}
#pragma optimize("", on)
#elif defined(__clang__)
template <class T>
__attribute__((__optnone__)) void doNotOptimizeAway(T&& /* datum */) {}
#else
template <class T>
void doNotOptimizeAway(T&& datum) {
    asm volatile("" : "+r" (datum));
}
#endif
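A minimal usage sketch for the last (GNU-style) branch only, assuming GCC or a compatible compiler. `sum_to` is a hypothetical function invented for illustration. The `"+r"` constraint marks `datum` as both read and written by the (empty) asm statement, so the compiler has to compute the value and cannot assume it is unchanged afterwards. Note that `"+r"` needs a modifiable value that fits in a register, so this form is not suitable for const objects or large structs.

```cpp
// GNU-style branch of the snippet above; assumes GCC or a compatible compiler.
template <class T>
void doNotOptimizeAway(T&& datum) {
    asm volatile("" : "+r"(datum));
}

// Hypothetical benchmark body: without the call, an optimizer is free to
// replace the whole loop with a closed-form result.
long sum_to(long n) {
    long total = 0;
    for (long i = 1; i <= n; ++i) {
        total += i;
        doNotOptimizeAway(total);  // the empty asm leaves the value unchanged
    }
    return total;
}
```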
Here is a link to code on GitHub.
I was looking for a way to achieve the exact same thing in my own little benchmark lib. The frustrating thing about MSVC is that it allows the __asm trick when targeting x86 but not when targeting x64!
After some tries I reused Google's solution without incurring an additional call! The nice thing is that the solution works with both MSVC (/Ox) and GCC (-O3).
template <class T>
inline auto doNotOptimizeAway(T const& datum) {
    return reinterpret_cast<char const volatile&>(datum);
}
At the call site I simply do not use the returned value!
int main()
{
    int a{10};
    doNotOptimizeAway(a);
    return 0;
}
Generated ASM (Compiler Explorer)
a$ = 8
main PROC
        mov     DWORD PTR a$[rsp], 10
        movzx   eax, BYTE PTR a$[rsp]
        xor     eax, eax
        ret     0
main ENDP
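The `movzx ... BYTE PTR` line in the listing is the volatile read: because the return type is `auto`, it deduces to plain `char` (references and cv-qualifiers are dropped), so the return statement itself loads one byte through the volatile reference. Only the first byte of `datum` is touched. The sketch below makes that explicit; `touchFirstByte` is a hypothetical variant with an invented name, not part of any library, and it relies on the compiler treating an access through a volatile-qualified reference as something it must keep, which the listing above suggests MSVC does.

```cpp
#include <type_traits>

// The version from the answer: the return statement performs the volatile read.
template <class T>
inline auto doNotOptimizeAway(T const& datum) {
    return reinterpret_cast<char const volatile&>(datum);
}

// `auto` drops the reference and the cv-qualifiers, so a plain char comes back.
static_assert(std::is_same<decltype(doNotOptimizeAway(0)), char>::value,
              "return type deduces to char");

// Hypothetical variant: same one-byte volatile read, but nothing is returned.
template <class T>
inline void touchFirstByte(T const& datum) {
    char const volatile& byte = reinterpret_cast<char const volatile&>(datum);
    char sink = byte;  // volatile read of the first byte of datum
    (void)sink;        // silence the unused-variable warning
}
```

Returning `void` makes it clearer at the call site that there is nothing to use; whether it generates identical code on a given compiler is worth checking in Compiler Explorer.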