Is there any difference, and what is the best way to define device constants in a CUDA program? In a C++ host/device program, if I want to define constants that reside in device constant memory, I can write either
__device__ __constant__ float a = 5;
or
__constant__ float a = 5;
Question 1. On devices of compute capability 2.x with CUDA 4, is this the same as
__device__ const float a = 5;
Question 2. Why is it that in PyCUDA's SourceModule("""..."""), which compiles only device code, even the following works?
const float a = 5;
In CUDA, __constant__ is a variable type qualifier that indicates the variable being declared is to be stored in device constant memory. Quoting section B.2.2 of the CUDA programming guide:
The __constant__ qualifier, optionally used together with __device__, declares a variable that:
- Resides in constant memory space,
- Has the lifetime of an application,
- Is accessible from all the threads within the grid and from the host through the runtime library (cudaGetSymbolAddress() / cudaGetSymbolSize() / cudaMemcpyToSymbol() / cudaMemcpyFromSymbol() for the runtime API and cuModuleGetGlobal() for the driver API).
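To make that host-side access concrete, here is a minimal sketch (not from the original question; the names scale, host_a and d_out are illustrative) of a __constant__ variable set from the host with cudaMemcpyToSymbol() before a kernel reads it:

#include <cstdio>
#include <cuda_runtime.h>

// Declared at file scope: lives in the constant memory space,
// has application lifetime, and is visible to all kernels.
__constant__ float a = 5.0f;

__global__ void scale(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = a * i;   // every thread reads the same cached value
}

int main()
{
    // Read-only for kernels, but still modifiable from the host at runtime.
    float host_a = 10.0f;
    cudaMemcpyToSymbol(a, &host_a, sizeof(float));

    const int n = 8;
    float *d_out;
    cudaMalloc(&d_out, n * sizeof(float));
    scale<<<1, n>>>(d_out, n);

    float h_out[n];
    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i)
        printf("%f\n", h_out[i]);   // prints multiples of 10, not 5

    cudaFree(d_out);
    return 0;
}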
In CUDA, constant memory is a dedicated, statically sized, global memory area accessed through a cache (there is a dedicated set of PTX load instructions for this purpose), and it is uniform and read-only for all threads in a running kernel. However, the contents of constant memory can be modified at runtime through the host-side APIs quoted above. This is different from declaring a variable to the compiler with the const qualifier, which merely adds a read-only characteristic to the variable at the scope of its declaration. The two are not at all the same thing.
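For contrast, here is a minimal sketch of the plain-const case from Question 2, i.e. device-only source such as the string passed to PyCUDA's SourceModule (kernel name and layout are illustrative). Because the variable is const-qualified and initialized with a constant expression, device code may use it directly; the compiler typically folds the value into the generated code, so no constant-memory symbol is created and none of the host APIs above can address or modify it:

// Device-only source, e.g. the body of a PyCUDA SourceModule string.
const float a = 5.0f;

__global__ void scale(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = a * i;   // "a" behaves as a compile-time constant;
                          // no cudaMemcpyToSymbol() access is possible
}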