I'm using nsight 2.2, Toolkit 4.2, latest nvidia driver, I'm using couple GPUs in my computer. Build customize 4.2. I have set "generate GPU ouput" on CUDA's project properties, nsight monitor is on (everything looks great).
I set several break points on my global - kernel function. nsight stops at the declaration of the function, but skips over several break points. It's just like nsight decide whether to hit a break point or skip over a break point. The funny thing is that nsight stops at for loops, but doesn't stop on simple assignment operations.
One more problem is that I can't set focus or add variables to the watch list. In this case (see attached screenshot) I can't resolve the value of variable : "posss" or "testDetctoinRate1" which are registers in this case. On the other hand, shared memory or block memory would insert automatically to the local's list.
Here is a screen shot of the kernel, before debugging
Here is a screen shot during debugging
I evoke my kernel function with following call:
checkCUDA<<<1, 32>>>(sumMat->rows,sumMat->cols , (UINT *)pGPUsumMat);
cudaError = cudaGetLastError();
if(cudaError != cudaSuccess)
{
printf("CUDA error: %s\n", cudaGetErrorString(cudaError));
exit(-1);
}
kernel call works without an error.
Is there any option to forcing nsight stops at all breakpoints? How can I add thread's registers to my watch list?
Initially, my debug command line is as follows:
# Runtime API (NVCC Compilation Type is hybrid object or .c file)
set CUDAFE_FLAGS=--sdk_dir "c:\Program Files\Microsoft SDKs\Windows\v7.0A\"
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\bin\nvcc.exe" --use-local-env --cl-version 2010 -ccbin "C:\Program Files\Microsoft Visual Studio 10.0\VC\bin" -I"..\..\..\opencv\modules\gpu\src\opencv2\gpu\device" -I"..\..\..\opencv\modules\gpu\include\opencv2\gpu" -I"..\..\..\build\include\\" -G --keep-dir "Debug" -maxrregcount=0 --machine 32 --compile -g -Xcompiler "/EHsc /nologo /Od /Zi /MDd " -o "Debug\%(Filename)%(Extension).obj" "%(FullPath)"
I changed on property page --> cuda --> host --> generate hosting debug information --> No
Now my command line doesn't contain the -g and -O letters , my command line is as followed:
# Runtime API (NVCC Compilation Type is hybrid object or .c file)
set CUDAFE_FLAGS=--sdk_dir "c:\Program Files\Microsoft SDKs\Windows\v7.0A\"
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\bin\nvcc.exe" --use-local-env --cl-version 2010 -ccbin "C:\Program Files\Microsoft Visual Studio 10.0\VC\bin" -I"..\..\..\opencv\modules\gpu\src\opencv2\gpu\device" -I"..\..\..\opencv\modules\gpu\include\opencv2\gpu" -I"..\..\..\build\include\\" -G --keep-dir "Debug" -maxrregcount=0 --machine 32 --compile -Xcompiler "/EHsc /nologo /Od /Zi /MDd " -o "Debug\%(Filename)%(Extension).obj" "%(FullPath)"
Although, I do debug with -o
, does it matter? It doesn't make any change.
Right click the .cu file in the Solution Explorer, then go to CUDA C/C++ | Device
and set Generate GPU Debug Information
to Yes (-G0)
.
Check whether "Enable CUDA Memory Checker" under Nsight is turned off or not. It may allow NSight to stop breakpoints of your CUDA kernel code on Debug mode of VS C++ 2010. At least, it works for me.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With