
Resetting GPU and driver after CUDA error


Sometimes bugs in my CUDA programs break the desktop graphics (on Windows). Typically the screen remains somewhat readable, but whenever the display changes, for example when I drag a window, lots of semi-random colored pixels and small blocks appear.

I have tried to reset the GPU and driver by changing the desktop resolution, but that doesn't help. The only fix I have found is to reboot the computer.

Is there a program out there or some trick I can use to get the driver and GPU to reset without rebooting?

Background:

I have owned cards of compute capability 1.0, 1.1, 1.3 and 2.0, but I only have a 1.1 and a 2.0 card now. I have seen the issue on 1.0 and 1.1, I'm fairly sure I have seen it on 1.3, and I'm unsure about 2.0. Did memory protection get added some time around 1.3?

I am almost sure this is not due to unstable hardware: the problems seem to have been triggered by bugs in my code and disappeared once the bugs were fixed. Finished code has run stably on all the cards. I wrote this question after seeing the issue on my 1.1 card, but it went away after I fixed a bug, and I no longer have any code that reproduces it. Maybe I should try writing to random locations on the 1.1 card and see if anything happens...

Roger Dahl asked Jun 03 '12 15:06



1 Answer

The same problem sometimes occurs on Unix, and Google forwarded me to this thread, so I hope this helps somebody else.

On Ubuntu, unloading and reloading the nvidia_uvm kernel module solved the problem for me:

sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm
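
If you find yourself doing this often, the two commands above could be wrapped in a small helper. This is only a sketch: the names `run`, `reload_module` and the `DRY_RUN` variable are my own for illustration, not part of any NVIDIA tool. The one thing it adds is that it stops if the unload fails (for example because a CUDA process is still using the module) instead of going on to `modprobe`.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the rmmod/modprobe pair from the answer above.
# run, reload_module and DRY_RUN are hypothetical names for this example.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Unload and reload one kernel module. If the unload fails
# (e.g. the module is still in use), report it and skip the reload.
reload_module() {
    local module="$1"
    run sudo rmmod "$module" || {
        echo "could not unload $module" >&2
        return 1
    }
    run sudo modprobe "$module"
}
```

After sourcing the file, `DRY_RUN=1 reload_module nvidia_uvm` prints the commands without touching the driver, and `reload_module nvidia_uvm` actually runs them.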
fraank answered Mar 29 '23 19:03