Some newer Intel processors have both the RDTSC and RDTSCP instructions, while most older processors have only RDTSC.
While coding in C/C++, how can I detect at compile time whether the target architecture has the RDTSCP instruction?
I know we can check this manually by browsing the CPU info (e.g., cat /proc/cpuinfo) and then adjusting our code. But getting this information at compile time (as a macro or flag value) would remove the need to manually check and edit the code.
GCC defines many macros to determine at compile time whether a particular feature is supported by the microarchitecture specified using -march. You can find the full list in the source code here. It's clear that GCC does not define such a macro for RDTSCP (or even RDTSC for that matter). The processors that support RDTSCP are listed in: What is the gcc cpu-type that includes support for RDTSCP?.
So you can make your own (potentially incomplete) list of microarchitectures that support RDTSCP. Then write a build script that checks the argument passed to -march and sees whether it is in the list. If it is, define a macro such as __RDTSCP__ and use it in your code. I presume that even if your list is incomplete, this should not compromise the correctness of your code, since a missing entry only means the macro stays undefined.
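The list lookup itself can be sketched as follows. This is only an illustration, written in C to match the rest of the code here; a real build script would more likely be shell or CMake. The helper name march_has_rdtscp and the list contents are assumptions made for the example, and the list is deliberately incomplete (it names only a few Intel microarchitectures from Nehalem onward).

```c
#include <stddef.h>
#include <string.h>

/* Illustrative, deliberately incomplete list of -march values whose CPUs
   have RDTSCP (Intel parts from Nehalem onward). Extend it as needed. */
static const char *const rdtscp_marches[] = {
    "nehalem", "westmere", "sandybridge", "ivybridge",
    "haswell", "broadwell", "skylake",
};

/* Hypothetical helper: returns 1 if `march` is on the list, 0 otherwise.
   An unknown name returns 0, so the build would leave the macro undefined. */
int march_has_rdtscp(const char *march) {
    if (march == NULL)
        return 0;
    for (size_t i = 0; i < sizeof rdtscp_marches / sizeof rdtscp_marches[0]; i++) {
        if (strcmp(march, rdtscp_marches[i]) == 0)
            return 1;
    }
    return 0;
}
```

A build step would define __RDTSCP__ (for example with -D__RDTSCP__) only when this returns 1. Note that -march=native does not match any entry, so in this sketch it falls back to the undefined-macro path.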
Unfortunately, the Intel datasheets do not seem to specify whether a particular processor supports RDTSCP even though they discuss other features such as AVX2.
One potential problem here is that there is no guarantee that every single processor that implements a particular microarchitecture, such as Skylake, supports RDTSCP. I'm not aware of any such exceptions, though.
Related: What is the gcc cpu-type that includes support for RDTSCP?.
To determine RDTSCP support at run time, the following code can be used on compilers supporting GNU extensions (GCC, clang, ICC), on any x86 OS. cpuid.h comes with the compiler, not the OS.
#include <cpuid.h>

int rdtscp_supported(void) {
    unsigned a, b, c, d;
    if (__get_cpuid(0x80000001, &a, &b, &c, &d) && (d & (1<<27)))
    {
        // RDTSCP is supported.
        return 1;
    }
    else
    {
        // RDTSCP is not supported.
        return 0;
    }
}
__get_cpuid() runs CPUID twice: once to check the maximum supported level, and once with the specified leaf value. It returns false if the requested level isn't even available, which is why it's part of a && expression. You probably don't want to call this every time before rdtscp; use it once as an initializer for a variable, unless it's just a simple one-off program. See it on the Godbolt compiler explorer.
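One way to run the check only once is to cache the result in a static variable. The following is a minimal sketch, not a definitive implementation: the wrapper name has_rdtscp is made up for this example, and the architecture guard (returning false on non-x86 targets) is an added assumption so the file still compiles elsewhere.

```c
#include <stdbool.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* __get_cpuid(), shipped with GCC and clang */
#endif

/* Hypothetical wrapper: runs CPUID on the first call only, then
   returns the cached answer on every later call. */
bool has_rdtscp(void) {
    static int cached = -1;   /* -1 means "not queried yet" */
    if (cached < 0) {
#if defined(__i386__) || defined(__x86_64__)
        unsigned a, b, c, d;
        /* Leaf 0x80000001, EDX bit 27 reports RDTSCP support. */
        cached = __get_cpuid(0x80000001, &a, &b, &c, &d) && (d & (1u << 27));
#else
        cached = 0;           /* assumption: not x86, so no RDTSCP */
#endif
    }
    return cached != 0;
}
```

The cache is not synchronized. If several threads race on the first call they all compute the same value, but a program that cares about strict data-race freedom should initialize it before starting threads.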
For MSVC, see How to detect rdtscp support in Visual C++? for code using its intrinsic.
For some CPU features that GCC does know about, you can use __builtin_cpu_supports to check a feature bitmap that's initialized early in startup.
// unfortunately no equivalent for RDTSCP
int sse42_supported() {
    return __builtin_cpu_supports("sse4.2");
}
Editor's note: https://gcc.gnu.org/wiki/DontUseInlineAsm. For a long time this answer was unsafe, and it was later edited so that it did not even compile while still being unsafe (clobbering RAX made the "a" constraint unsatisfiable, while clobbers were still missing on registers that CPUID writes). Use the intrinsics in another answer. (But I've fixed the inline asm in this to be safe and correct, in case anyone does copy/paste it, or wants to learn how to use constraints and clobbers properly.)
After investigating a little more based on the suggestions made by @Jason, I now have a run-time solution (still not a compile-time one) to determine whether RDTSCP exists: check the 28th bit (bit 27 of EDX; see output bitmap) of the output of the cpuid instruction with 0x80000001 as input in EAX.
int if_rdtscp() {
    unsigned int edx = 0;  // stays 0 ("not supported") if no CPUID path is compiled in
    unsigned int eax = 0x80000001;
#ifdef __GNUC__  // GNU extended asm supported
    __asm__ (  // doesn't need to be volatile: same EAX input -> same outputs
        "CPUID\n\t"
        : "+a" (eax),  // CPUID writes EAX, but we can't declare a clobber on an input-only operand.
          "=d" (edx)
        : // no read-only inputs
        : "ecx", "ebx");  // CPUID writes E[ABCD]X, declare clobbers
    // a clobber on ECX covers the whole RCX, so this code is safe in 64-bit mode but is portable to either.
#else  // Non-gcc/g++ compilers.
    // To-do when needed
#endif
    return (edx >> 27) & 0x1;
}
If this doesn't work in 32-bit PIC code because of the EBX clobber, then: (1) stop using 32-bit PIC, because it's inefficient vs. 64-bit PIC or vs. -fno-pie -no-pie executables; (2) get a newer GCC that allows EBX clobbers even in 32-bit PIC code, emitting extra instructions to save/restore EBX or whatever is needed; or (3) use the intrinsics version (which should work around this for you).
For now I am fine with GNU compilers, but if somebody needs to do this under MSVC, there is an intrinsic-based way to check this, as explained here.