Does anybody know how to check whether code is running on the GPU or the CPU using CUDA?
__device__ __host__ double count_something(double variable) {
    if (RUN_ON_GPU) {  // RUN_ON_GPU stands for the check I'm looking for
        use_cuda_variables();
    } else {
        use_cpu_variables();
    }
}
There is no way to check at runtime which architecture a piece of code is running on, but there is also no need to know, because it can be determined at compile time and handled accordingly. nvcc defines several preprocessor symbols which can be used to identify the compilation trajectory while code is being compiled. The key symbol is __CUDA_ARCH__, which is never defined when compiling host code and always defined when compiling device code.
So it is possible to write a function like this:
__device__ __host__ float function(float x)
{
#ifdef __CUDA_ARCH__
    // Device compilation pass: use the fast hardware intrinsic.
    return 10.0f * __sinf(x);
#else
    // Host compilation pass: use the standard library.
    return 10.0f * sinf(x);
#endif
}
which will emit different code depending on whether it is compiled for the GPU or host. You can read a more thorough discussion about compilation steering in this Stack Overflow question or in the C language extensions section of the CUDA programming guide.
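Note that in device code __CUDA_ARCH__ is not merely defined; it expands to the compute capability being compiled for (e.g. 700 when compiling for sm_70), so the same mechanism can select between per-architecture code paths. A minimal sketch, where the sm_70 threshold is purely illustrative:

#include <cstdio>

__global__ void arch_kernel()
{
#if __CUDA_ARCH__ >= 700
    // Emitted only in device compilation passes targeting Volta or newer.
    printf("compiled for sm_70 or newer\n");
#else
    // Emitted for older device targets (and parsed, unused, in the host pass).
    printf("compiled for a pre-Volta architecture\n");
#endif
}

When the file is built for several architectures with -gencode, each device compilation pass takes the branch matching its own target.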
I can't add proper code markdown in comments, so I decided to post a full answer.
Checking only whether the __CUDA_ARCH__ define exists is not fully correct. In some cases that check doesn't work; I spent a lot of time debugging before I found the solution (the current CUDA documentation doesn't mention this behavior). __CUDA_ARCH__ can be defined even in host code, but in that case it is defined as 0. Thus the proper check is something like this:
__device__ __host__ float function(float x)
{
#if (defined(__CUDA_ARCH__) && (__CUDA_ARCH__ > 0))
    // device code here
    return 10.0f * __sinf(x);
#else
    // host code here
    return 10.0f * sin(x);
#endif
}
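For completeness, here is a minimal harness that exercises both branches; the kernel name check_kernel and the test value are my own additions for illustration, not part of the original answer:

#include <math.h>
#include <cstdio>

__device__ __host__ float function(float x)
{
#if (defined(__CUDA_ARCH__) && (__CUDA_ARCH__ > 0))
    return 10.0f * __sinf(x);   // device branch: fast hardware intrinsic
#else
    return 10.0f * sinf(x);     // host branch: standard library
#endif
}

// Runs on the GPU, so function() resolves to the device branch.
__global__ void check_kernel(float x)
{
    printf("device: %f\n", function(x));
}

int main()
{
    printf("host:   %f\n", function(1.0f));  // host branch
    check_kernel<<<1, 1>>>(1.0f);            // device branch
    cudaDeviceSynchronize();                 // wait for device printf to flush
    return 0;
}

Compile with nvcc (e.g. nvcc -o check check.cu) and both lines should print approximately the same value, each computed by its own branch.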