I'm having trouble assigning a value to a device variable and then copying it to a host variable.
I start with d_test and h_test both set to 0.0. I have a simple kernel that sets the device variable, d_test, to 1.0. I then copy this to the host variable h_test and print it. The problem is that the printed value is h_test = 0.0. What am I doing wrong? Here's the code:
// -*- mode: C -*-
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>
// device variable and kernel
__device__ float d_test;
__global__ void kernel1(float d_test) { d_test = 1.0; }
int main() {
// initialise variables
float h_test = 0.0;
cudaMemset(&d_test,0,sizeof(float));
// invoke kernel
kernel1 <<<1,1>>> (d_test);
// Copy device variable to host and print
cudaMemcpy(&h_test,&d_test,sizeof(float),cudaMemcpyDeviceToHost);
printf("%f\n",h_test);
}
There are a couple of issues with your code.
As pezcode correctly notes, kernel1's parameter d_test shadows your global variable, so when the kernel assigns to d_test it is actually changing the value of its parameter rather than the global variable you intended. kernel1 need not take an argument in this example.
Instead of cudaMemcpy, use cudaMemcpyFromSymbol when copying from a global __device__ variable.
Here's the full solution:
// -*- mode: C -*-
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>
// device variable and kernel
__device__ float d_test;
__global__ void kernel1() { d_test = 1.0; }
int main() {
// initialise variables
float h_test = 0.0;
// &d_test is not a valid device pointer in host code, so initialise via the symbol
cudaMemcpyToSymbol(d_test, &h_test, sizeof(float));
// invoke kernel
kernel1 <<<1,1>>> ();
// Copy device variable to host and print
cudaMemcpyFromSymbol(&h_test, d_test, sizeof(float), 0, cudaMemcpyDeviceToHost);
// Before CUDA 5.0 the symbol could also be named by a string:
// cudaMemcpyFromSymbol(&h_test, "d_test", sizeof(float), 0, cudaMemcpyDeviceToHost);
printf("%f\n",h_test);
}
And the output:
$ nvcc test.cu -run
1.000000
Update (November 2019):
The CUDA Toolkit documentation states that "use of a string naming a variable as the symbol parameter was deprecated in CUDA 4.1 and removed in CUDA 5.0."
cudaMemcpyFromSymbol worked for me after I had struggled for quite a while with cudaMemcpy, which kept returning the wrong values. I had to remove the quotes around "d_test". I also got an invalid argument error from the cudaMemset(&d_test, ...) call, so I went back to the cudaMalloc that was previously in my code.
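If you would rather keep cudaMemset and plain cudaMemcpy, one option is to ask the runtime for the symbol's device address with cudaGetSymbolAddress and use that pointer instead of &d_test. The following is only a minimal sketch under that assumption: the CHECK macro and the d_ptr variable are hypothetical names introduced here for illustration, not part of CUDA or of the code above. Checking every return value this way also surfaces errors such as the invalid argument mentioned above, rather than leaving a stale value in h_test.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical helper macro (not part of CUDA): abort with a message on any runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// device variable and kernel
__device__ float d_test;
__global__ void kernel1() { d_test = 1.0f; }

int main() {
    float h_test = 0.0f;
    float *d_ptr = NULL;  // hypothetical name: will hold the device address of d_test

    // Look up the device address of the symbol; &d_test is not usable from host code.
    CHECK(cudaGetSymbolAddress((void **)&d_ptr, d_test));
    CHECK(cudaMemset(d_ptr, 0, sizeof(float)));

    kernel1<<<1, 1>>>();
    CHECK(cudaGetLastError());       // reports launch errors
    CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    // With a real device pointer, plain cudaMemcpy can be used.
    CHECK(cudaMemcpy(&h_test, d_ptr, sizeof(float), cudaMemcpyDeviceToHost));
    printf("%f\n", h_test);
    return 0;
}
```

This can be compiled the same way as the solution above (nvcc test.cu -run). Whether to use the symbol functions or a looked-up pointer is mostly a matter of taste; the pointer form is convenient when the same address is passed to several runtime calls.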