Since last week I have had a big problem with my CUDA development setup. I have an integrated GPU that my monitors are attached to, and an extra NVIDIA card for running my CUDA kernels. However, I cannot debug my code anymore, because cuda-gdb reports:
fatal: All CUDA devices are used for display and cannot be used while debugging. (error code = CUDBG_ERROR_ALL_DEVICES_WATCHDOGGED(0x18)
It seems that my X server is somehow blocking the NVIDIA GPU, because if I switch to another virtual console (CTRL+ALT+F1) I am able to run my code using cuda-gdb. No monitor cable is plugged into the NVIDIA card.
"lsof /dev/nvidia*" does not give any output. I am using Xubuntu 14.04.
Does anyone have an idea how to solve this problem?
On devices with compute capability 3.5 (SM35) or higher, we can apparently get around this by setting the environment variable
CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1
This is described in the cuda-gdb documentation: http://docs.nvidia.com/cuda/cuda-gdb/#axzz4BrMPoaoW
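If you would rather not export the variable for the whole shell session, it can be scoped to a single command. The following is a minimal sketch; the helper name with_preemption is hypothetical (it is not part of the CUDA toolkit) and it only relies on the standard env utility:

```shell
# Hypothetical helper: run one command with software preemption enabled,
# without exporting the variable into the current shell.
with_preemption() {
    # env sets the variable only in the environment of the child process.
    env CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1 "$@"
}

# Usage (assumes cuda-gdb is on PATH and ./a.out was built with -g -G):
#   with_preemption cuda-gdb ./a.out
```

This keeps other CUDA programs started from the same terminal unaffected by the setting.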
Here's a test. I am running on a Maxwell Quadro GPU:
nvidia-smi
Fri Jun 17 10:59:47 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.63     Driver Version: 352.63         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro M4000M       Off  | 0000:01:00.0      On |                  N/A |
| N/A   37C    P8     9W / 100W |    158MiB /  4087MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2981    G   /usr/bin/X                                      57MiB |
|    0      9186    G   ...ves-passed-by-fd --v8-snapshot-passed-by-    85MiB |
+-----------------------------------------------------------------------------+
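In the output above, the Disp.A column reads On and /usr/bin/X appears in the process list, so the display server is active on this GPU. To check only that field, something like the following sketch may help. It assumes your driver's nvidia-smi supports the display_active query field, and the function name gpu_display_state is hypothetical:

```shell
# Hypothetical helper: list each GPU with its display-active state.
# display_active is the nvidia-smi query field behind the "Disp.A" column.
gpu_display_state() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=index,name,display_active --format=csv
    else
        echo "nvidia-smi not found on PATH" >&2
        return 1
    fi
}

# Usage:
#   gpu_display_state
```

A GPU that reports the display as active is a likely candidate for the "used for display" error, even when no monitor cable is plugged into it.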
Build the application with debug information and run it under cuda-gdb:
nvcc -g -G foo.cu
cuda-gdb ./a.out
...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
fatal: All CUDA devices are used for display and cannot be used while debugging. (error code = CUDBG_ERROR_ALL_DEVICES_WATCHDOGGED(0x18)
Now set the environment variable and start cuda-gdb again:
export CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1
cuda-gdb ./a.out
(cuda-gdb) r
...
warning: Cuda API error detected: cudaMemcpy returned (0xb)
warning: Cuda API error detected: cudaFree returned (0x11)
[Thread 0x7fffed3ff700 (LWP 10302) exited]
[Thread 0x7ffff7fc6780 (LWP 10293) exited]