I’m running a virtual machine on GCP with a Tesla GPU, and I’m trying to deploy a PyTorch-based app and accelerate it with that GPU.
I want Docker containers to have access to this GPU. I managed to install all the drivers on the host machine, and the app runs fine there, but when I try to run it in Docker (based on the nvidia/cuda image), PyTorch fails:
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py", line 82, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
To get some info about the NVIDIA drivers visible to the container, I run this:
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
But it complains:
docker: Error response from daemon: Unknown runtime specified nvidia.
On the host machine, the nvidia-smi output looks like this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE... On | 00000000:00:04.0 Off | 0 |
| N/A 39C P0 35W / 250W | 873MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
If I check the runtimes registered with Docker, I get only the runc runtime — no nvidia runtime, unlike the examples around the internet.
$ docker info|grep -i runtime
Runtimes: runc
Default Runtime: runc
How can I add this nvidia runtime to my Docker installation? Most posts and questions I have found so far say something like "I just forgot to restart my Docker daemon, and then it worked", but that does not help me. What should I do?
I checked many issues on GitHub, as well as the #1, #2 and #3 Stack Overflow questions — they didn't help.
You should already be able to run nvidia-smi on the host and see your GPU's name, driver version, and CUDA version. To use the GPU from Docker, add the NVIDIA Container Toolkit to the host; it integrates with Docker Engine to automatically configure your containers for GPU support.
NVIDIA Container Runtime is a GPU-aware container runtime, compatible with the Open Container Initiative (OCI) specification used by Docker, CRI-O, and other popular container technologies. It simplifies building and deploying containerized GPU-accelerated applications to desktops, the cloud, or data centers.
nvidia-docker is essentially a wrapper around the docker command that transparently provisions a container with the components needed to execute code on the GPU. It is only strictly necessary when using nvidia-docker run to execute a container that uses GPUs.
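As a side note: on newer Docker versions (19.03 and later) with the NVIDIA Container Toolkit installed, the --gpus flag can request GPUs without registering a custom runtime at all. Whether this applies depends on your Docker version, so treat it as an alternative worth checking, not a guaranteed fix:

```shell
# With Docker 19.03+ and the NVIDIA Container Toolkit on the host,
# GPUs can be requested directly via --gpus (no "nvidia" runtime entry needed).
docker run --gpus all --rm nvidia/cuda nvidia-smi

# Request a specific count, or specific devices by index:
docker run --gpus 1 --rm nvidia/cuda nvidia-smi
docker run --gpus '"device=0"' --rm nvidia/cuda nvidia-smi
```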
The nvidia runtime you need is nvidia-container-runtime.
Follow the installation instructions here:
https://github.com/NVIDIA/nvidia-container-runtime#installation
Basically, if it's not already present, you install it with your package manager first:
sudo apt-get install nvidia-container-runtime
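As a quick sanity check after installing, you can verify where the runtime binary landed; that path must match the "path" entry you register in Docker's daemon configuration (the /usr/bin location is the usual one for the apt package, but confirm it on your own system):

```shell
# Locate the installed runtime binary; the printed path must match
# the "path" value registered in /etc/docker/daemon.json.
which nvidia-container-runtime
# Typically prints: /usr/bin/nvidia-container-runtime
```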
Then you add it to docker runtimes:
https://github.com/nvidia/nvidia-container-runtime#daemon-configuration-file
This option worked for me:
$ sudo tee /etc/docker/daemon.json <<EOF
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo pkill -SIGHUP dockerd
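If the SIGHUP reload does not pick up the new configuration (behavior can differ across Docker versions and init systems), fully restarting the daemon is the more heavy-handed but reliable option:

```shell
# Fully restart the Docker daemon so it re-reads /etc/docker/daemon.json.
# Note: this stops running containers unless live-restore is enabled.
sudo systemctl restart docker
```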
Check that it's added:
$ docker info|grep -i runtime
Runtimes: nvidia runc
Default Runtime: runc
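Once the nvidia runtime shows up there, the command from the question should work. A minimal end-to-end check (assuming the nvidia/cuda image's CUDA version is compatible with your host driver):

```shell
# Run nvidia-smi inside a container using the nvidia runtime;
# the output should match what you see on the host.
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
```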