What is the best practice for writing a cross-platform implementation of the x86 pause instruction? I am planning to use it in a busy-spin loop in a C++11 project.
If I were only using the gcc toolchain, I could use the _mm_pause intrinsic. Does this intrinsic do the right thing even when the native processor does not support the x86 pause instruction? I would also like my code to work with the clang/llvm toolchain.
I imagine that a fallback could use "std::this_thread::sleep_for", since I am using C++11. But I am not sure how to detect the processor capability (supports pause vs. does not) and fall back to sleep.
I am using cmake to build my project, and I will always build and deploy on the same machine, so I am comfortable detecting processor settings at compile time.
An example implementation (pseudocode):
#include <chrono>
#include <thread>

void pause() {
    // Not sure how to detect whether pause is available on the platform.
#if defined(USE_mm_pause)
    __asm__ volatile("pause");
#else
    std::this_thread::sleep_for(std::chrono::seconds(0));
#endif
}
Does this intrinsic do the right thing even when the native processor does not support the x86 pause instruction?
Yes. The pause instruction is encoded as F3 90. A pre-Pentium 4 processor that does not know about this instruction will decode it as REP NOP, which is just an ordinary NOP with a useless prefix byte. The processor will wait a cycle or two and then continue without altering the processor state in any way. You will not get the performance and power benefits of PAUSE, but the program will still work as expected.
Fun fact: REP NOP was even legal on the 8086, released roughly 35 years ago. That's what I call backward compatibility.
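Since the instruction is harmless on older x86 processors, no runtime capability check is needed there, and the only thing left to detect is whether the target is x86 at all. Here is a minimal sketch of how that could look with compile-time checks that both gcc and clang understand. The names cpu_relax, spin_until_set and SPIN_HAS_PAUSE are made up for illustration, and falling back to std::this_thread::yield() on non-x86 targets is my assumption rather than the only reasonable choice.

```cpp
#include <atomic>
#include <thread>

// Compiler-predefined macros for 32- and 64-bit x86 (gcc/clang and MSVC spellings).
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
  #include <immintrin.h>   // declares _mm_pause
  #define SPIN_HAS_PAUSE 1 // hypothetical helper macro
#else
  #define SPIN_HAS_PAUSE 0
#endif

// Hint to the CPU that we are in a spin-wait loop.
inline void cpu_relax() {
#if SPIN_HAS_PAUSE
    _mm_pause();               // emits F3 90; decodes as a plain NOP on old CPUs
#else
    std::this_thread::yield(); // assumed fallback for non-x86 targets
#endif
}

// Busy-spin until another thread sets the flag.
inline void spin_until_set(const std::atomic<bool>& flag) {
    while (!flag.load(std::memory_order_acquire)) {
        cpu_relax();
    }
}
```

Because the check relies on compiler-predefined macros, nothing extra should be needed on the cmake side for this to build on the machine you deploy on.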