Usage of global vs. constant memory in CUDA

Question

Hey there, I have the following piece of code:

#if USE_CONST == 1
    __constant__ double PNT[ SIZE ];    
#else
    __device__ double *PNT;
#endif

and a bit later I have:

#if USE_CONST == 0
    cudaMalloc((void **)&PNT, sizeof(double)*SIZE);
    cudaMemcpy(PNT, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
#else
    cudaMemcpyToSymbol(PNT, point, sizeof(double)*SIZE);
#endif

where point is defined earlier in the code. With USE_CONST=1 everything works as expected, but with USE_CONST=0 it does not. I access the array in my kernel function via

PNT[ index ]

What is the difference between the two variants that causes the problem? Thanks!

talonmies · Accepted Answer

The correct usage of cudaMemcpyToSymbol prior to CUDA 4.0 is:

cudaMemcpyToSymbol("PNT", point, sizeof(double)*SIZE);

or alternatively:

double *cpnt;
cudaGetSymbolAddress((void **)&cpnt, "PNT");
cudaMemcpy(cpnt, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);

which might be a bit faster if you are planning to access the symbol from the host API more than once.

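EDIT: I misunderstood the question. For the global memory version, do something similar to the second version for constant memory. Note that this copies the data into the storage of the symbol itself, so it only fits if PNT is declared as a statically sized array (__device__ double PNT[ SIZE ];) rather than as a pointer: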

double *gpnt;
cudaGetSymbolAddress((void **)&gpnt, "PNT");
cudaMemcpy(gpnt, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
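
If PNT has to stay a pointer (__device__ double *PNT;) as in the question, one possible approach is to allocate the buffer through an ordinary host-side pointer and then copy the pointer value into the device symbol. Calling cudaMalloc on &PNT from host code does not set the device variable, which is the likely reason the USE_CONST=0 branch fails. The following is a minimal sketch, not the answerer's code: the kernel readPoints, the names d_pnt and d_out, and the value of SIZE are made up for illustration, and it assumes CUDA 5.0 or later, where the symbol is passed directly instead of as a string.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define SIZE 8   // illustrative value, not from the question

// Device-side pointer variable, as in the question's USE_CONST == 0 branch.
__device__ double *PNT;

// Hypothetical kernel: reads the array through the device pointer.
__global__ void readPoints(double *out)
{
    int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < SIZE)
        out[index] = PNT[index];
}

int main()
{
    double point[SIZE], result[SIZE];
    for (int i = 0; i < SIZE; ++i)
        point[i] = 0.5 * i;

    // Allocate through host-side pointers and fill the buffer.
    double *d_pnt = NULL;
    double *d_out = NULL;
    cudaMalloc((void **)&d_pnt, sizeof(double) * SIZE);
    cudaMalloc((void **)&d_out, sizeof(double) * SIZE);
    cudaMemcpy(d_pnt, point, sizeof(double) * SIZE, cudaMemcpyHostToDevice);

    // Copy the pointer value (not the data) into the device symbol PNT.
    cudaMemcpyToSymbol(PNT, &d_pnt, sizeof(double *));

    readPoints<<<1, SIZE>>>(d_out);
    cudaError_t err = cudaMemcpy(result, d_out, sizeof(double) * SIZE,
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < SIZE; ++i)
        printf("%d: %f\n", i, result[i]);

    cudaFree(d_pnt);
    cudaFree(d_out);
    return 0;
}
```

The kernel keeps accessing the data as PNT[ index ], so only the host-side setup changes between the two variants.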

Tags: memory, cuda

tim
