If I compile a CUDA program for a lower Compute Capability, e.g. 1.3 (nvcc flag sm_13), and run it on a device with Compute Capability 2.1, will it exploit the features of Compute 2.1 or not?
In that situation, will the compute 2.1 device behave like a compute 1.3 device?
No, it won't exploit any features you need to explicitly program for. Only those features that are transparent to the user (like cache or larger register files) will be used.
Additionally, you need to make sure your object file contains a version of the code compiled to the PTX intermediate language, which can be dynamically compiled for the target architecture at run time, or your program will not even run.
Compile to a virtual architecture (nvcc -arch compute_13) to ensure that, or create a fat binary with code for multiple architectures using the -gencode option to nvcc.
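To see what a given build actually contains, a small check like the one below may help. It is only a sketch, not part of the original answer: noop_kernel and report_versions are made-up names, the file name versions.cu is arbitrary, and it assumes device 0 is the GPU you care about. The nvcc lines in the comments use the architectures from the question. Recent toolkits no longer accept these old targets, so substitute ones your nvcc supports.

```cuda
// Build with PTX only (JIT-compiled by the driver on first use):
//   nvcc -arch=compute_13 versions.cu -o versions
// Or build a fat binary with native code for two GPUs plus PTX as a fallback:
//   nvcc -gencode arch=compute_13,code=sm_13 \
//        -gencode arch=compute_20,code=sm_21 \
//        -gencode arch=compute_20,code=compute_20 versions.cu -o versions
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical empty kernel, used only so there is something to query.
__global__ void noop_kernel() {}

// Prints the device's compute capability and the versions the kernel image
// selected for it was built for. Returns 0 on success, 1 on any CUDA error.
int report_versions()
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("device 0: compute capability %d.%d\n", prop.major, prop.minor);

    cudaFuncAttributes attr;
    err = cudaFuncGetAttributes(&attr, noop_kernel);
    if (err != cudaSuccess) {
        // Likely cause: the executable holds neither native code nor PTX
        // that this device can use.
        fprintf(stderr, "cudaFuncGetAttributes: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // Both fields are encoded as major * 10 + minor (e.g. 13 for 1.3).
    printf("kernel: ptxVersion %d, binaryVersion %d\n",
           attr.ptxVersion, attr.binaryVersion);
    return 0;
}

int main()
{
    return report_versions();
}
```

If the query fails on the target machine, the most likely explanation is the one given above: no usable image for that architecture made it into the executable.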
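With a fat binary, you can program for features available only on higher compute capabilities if you wrap the code inside #if __CUDA_ARCH__ >= xyz preprocessor conditionals, where xyz is the compute capability written as a three-digit number (e.g. 200 for 2.0, 210 for 2.1).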
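As an illustration of such a conditional, here is a rough sketch built around one feature that did change between the two generations: a native atomicAdd on float only exists from compute capability 2.0 on, so the 1.x path emulates it with atomicCAS (which itself needs 1.1 or later). The names atomic_add_float, count_threads and run_sum are made up for this example, and the launch configuration is arbitrary.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: atomically adds `value` to *addr, returns the old value.
__device__ float atomic_add_float(float *addr, float value)
{
#if __CUDA_ARCH__ >= 200
    // Compiled only into images for compute capability 2.0 and above.
    return atomicAdd(addr, value);
#else
    // Fallback for 1.x images: compare-and-swap loop on the raw bits.
    int *iaddr = (int *)addr;
    int old = *iaddr;
    int assumed;
    do {
        assumed = old;
        old = atomicCAS(iaddr, assumed,
                        __float_as_int(value + __int_as_float(assumed)));
    } while (assumed != old);
    return __int_as_float(old);
#endif
}

// Every thread adds 1.0f to the same counter.
__global__ void count_threads(float *sum)
{
    atomic_add_float(sum, 1.0f);
}

// Launches blocks * threads threads and returns the accumulated sum,
// or -1.0f if any CUDA call fails.
float run_sum(int blocks, int threads)
{
    float *d_sum = NULL;
    float h_sum = 0.0f;
    if (cudaMalloc(&d_sum, sizeof(float)) != cudaSuccess)
        return -1.0f;
    cudaError_t err = cudaMemcpy(d_sum, &h_sum, sizeof(float),
                                 cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        count_threads<<<blocks, threads>>>(d_sum);
        err = cudaGetLastError();  // catches launch failures
    }
    if (err == cudaSuccess)        // this copy also waits for the kernel
        err = cudaMemcpy(&h_sum, d_sum, sizeof(float),
                         cudaMemcpyDeviceToHost);
    cudaFree(d_sum);
    return err == cudaSuccess ? h_sum : -1.0f;
}

int main()
{
    printf("sum = %.1f\n", run_sum(4, 64));
    return 0;
}
```

The preprocessor check runs once per target architecture at compile time, not on the device at run time. With a fat binary each embedded image gets the branch that matches its own architecture, and the driver picks the best image for the GPU it finds.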