I am trying to use Nsight to debug the following code:
__device__ void change(int shared[])
{
    if (threadIdx.x < 10)
        shared[threadIdx.x] = threadIdx.x;
}

__global__ void MyK()
{
    int shared[10];
    change(shared);
    __syncthreads();
}
I am calling my kernel from main like this:
cudaSetDevice(1);
MyK<<<1,20>>>();
When I put a breakpoint before change(shared), I can see that the shared array has been created and its values are set to 0. However, if I put the breakpoint after __syncthreads(), the debugger shows the following error:
cannot resolve name shared
Can't I pass my shared array to another device function?
The reason you see "cannot resolve name shared" in the memory watch window is that the compiler has optimized the array away: nothing in your kernel uses it after change(shared). As @user586831 mentioned earlier, try outputting the value, for example as the return value of your device function.
On another note, I am not sure whether you really meant a __shared__ array or were just referring to the array by its name, shared. Either way, the code above does not use shared memory: int shared[10] is an ordinary local integer array, private to each thread. To declare shared memory you need the __shared__ qualifier, e.g.
__shared__ int shared[10];
(The extern __shared__ int shared[]; form, written without a size, is for shared memory whose size is supplied at kernel launch.)
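Putting the two points together, here is a minimal sketch, not a definitive fix. It assumes a statically sized __shared__ array and adds an output pointer so that the array has an observable use. The out parameter, the N macro and the runMyK host helper are hypothetical names introduced for illustration; they are not part of the original code. The sketch also runs on the default device rather than calling cudaSetDevice(1).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 10  // number of array elements, as in the question

// Device helper: a pointer to shared memory is passed like any other pointer.
__device__ void change(int *shared)
{
    if (threadIdx.x < N)
        shared[threadIdx.x] = threadIdx.x;
}

// 'out' is a hypothetical output parameter, not present in the original kernel.
__global__ void MyK(int *out)
{
    __shared__ int shared[N];  // one copy per block, visible to all of its threads
    change(shared);
    __syncthreads();           // wait until every thread in the block has written

    // Copying to global memory gives the array an observable use,
    // so the compiler has a reason to keep it around.
    if (threadIdx.x < N)
        out[threadIdx.x] = shared[threadIdx.x];
}

// Hypothetical host helper: launches the kernel and copies N ints into host_out.
cudaError_t runMyK(int *host_out)
{
    int *dev_out = nullptr;
    cudaError_t err = cudaMalloc(&dev_out, N * sizeof(int));
    if (err != cudaSuccess)
        return err;

    MyK<<<1, 20>>>(dev_out);   // same launch configuration as in the question
    err = cudaGetLastError();  // catches launch errors
    if (err == cudaSuccess)
        err = cudaMemcpy(host_out, dev_out, N * sizeof(int),
                         cudaMemcpyDeviceToHost);  // also waits for the kernel

    cudaFree(dev_out);
    return err;
}
```

Calling runMyK(buf) with a 10-element int buffer should leave buf[i] equal to i. With a breakpoint placed after __syncthreads(), the array is still read afterwards, so the debugger should have something to resolve.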