I'm on Ubuntu 14.04, CUDA toolkit 8, driver version 367.48.
When I run the nvidia-smi command, it just hangs indefinitely.
When I log in again and try to kill that nvidia-smi process, for example with kill -9 <PID>, it is not killed.
If I run another nvidia-smi command, I find both processes running (checked from another shell, of course, because the new one gets stuck just like the first).
Can it be an issue related to the driver? It's not the latest, but still quite new.
The NVIDIA System Management Interface (nvidia-smi) is a command line utility, based on top of the NVIDIA Management Library (NVML), intended to aid in the management and monitoring of NVIDIA GPU devices.
Replacing the NVIDIA driver itself can indeed be done without a reboot, using "sudo rmmod nvidia" and then "sudo nvidia-smi". Either way, make sure that no CUDA processes are running first.
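Before unloading the module, it can help to see which NVIDIA kernel modules are loaded and which processes still hold the device nodes open. This is only a minimal sketch, assuming a Linux system with /proc/modules and the fuser tool from psmisc; the helper name list_nvidia_modules is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the names of loaded kernel modules
# whose name starts with "nvidia".
# Reads /proc/modules by default; another file can be passed instead.
list_nvidia_modules() {
    awk '$1 ~ /^nvidia/ { print $1 }' "${1:-/proc/modules}"
}

# Usage (as root):
#   list_nvidia_modules
#   fuser -v /dev/nvidia*    # processes holding the device nodes open
```

If modules that depend on nvidia (nvidia_uvm, for example) show up in the list, rmmod will report nvidia as in use until those are removed first.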
I solved this problem by running the following at every boot:
sudo nvidia-smi -pm 1
The above command enables persistence mode. This issue has been affecting NVIDIA drivers for over two years, but they don't seem interested in fixing it. It seems to be related to power management: some time after booting into the OS, if the nvidia-persistenced service has the no-persistence-mode option enabled, the GPU will save power, and the nvidia-smi command will hang, waiting for something to give it control of the device again.
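One way to run this at every boot is a small script called from a boot-time hook that runs as root (/etc/rc.local is one possibility, but that choice is an assumption). In the sketch below, the function name enable_persistence_mode and the NVIDIA_SMI variable are hypothetical, added only so the binary path can be overridden.

```shell
#!/bin/sh
# Sketch: enable persistence mode on all GPUs. Must run as root.
# NVIDIA_SMI is an illustrative override; it defaults to nvidia-smi on PATH.
NVIDIA_SMI="${NVIDIA_SMI:-nvidia-smi}"

enable_persistence_mode() {
    # Fail early with a clear message if the tool is not available.
    if ! command -v "$NVIDIA_SMI" >/dev/null 2>&1; then
        echo "nvidia-smi not found: $NVIDIA_SMI" >&2
        return 1
    fi
    # -pm 1 is the same flag as in the command above.
    "$NVIDIA_SMI" -pm 1
}
```

Calling enable_persistence_mode from the boot hook then has the same effect as typing the command by hand after each reboot.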
Given your peculiar situation, I would try to reinstall it, as bio proposed.
Have you tried sudo kill -9 <PID>? You probably have, but I'm still putting it out there. Or perhaps try sudo kill -15 <PID> to terminate it. From what you told us, it seems as if your driver is stuck in a signal 1 hangup.
It seems odd that nvidia-smi would hang spontaneously when run, but the underlying issue may be that it is not installed correctly or is not being run with superuser access.
Have you tried to use:
service nvidia-smi status
pgrep nvidia-smi
ps -aux | grep nvidia-smi
to get its current state?
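A process that ignores kill -9 is often in uninterruptible sleep (state D in ps), blocked inside a kernel call; signals are only acted on once that call returns. The state can be checked roughly as sketched here. proc_state is a hypothetical helper name, and the ps options assume procps on Linux.

```shell
#!/bin/sh
# Hypothetical helper: print the state code of a PID (e.g. S, R, D, Z).
proc_state() {
    ps -o stat= -p "$1"
}

# Usage:
#   proc_state <PID>
#   # PID, state and kernel wait channel for every nvidia-smi process:
#   ps -o pid,stat,wchan:32,cmd -C nvidia-smi
```

If the state starts with D, the process is waiting in the kernel, which would point at the driver rather than at nvidia-smi itself.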
Anyway, hope this helps. I would try to uninstall and reinstall, or use sudo apt --fix-broken install to try to fix broken packages/drivers.
Cheers!