I’m running a virtual machine on GCP with a Tesla GPU, and I’m trying to deploy a PyTorch-based app and accelerate it with that GPU.
I want Docker containers to have access to this GPU. I managed to install all the drivers on the host machine, and the app runs fine there, but when I try to run it in Docker (based on the nvidia/cuda image), PyTorch fails:
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py", line 82, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
To get some info about the NVIDIA drivers visible to the container, I run this:
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
But it complains:
docker: Error response from daemon: Unknown runtime specified nvidia.
On the host machine, the nvidia-smi output looks like this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE... On | 00000000:00:04.0 Off | 0 |
| N/A 39C P0 35W / 250W | 873MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
If I check the runtimes registered with Docker, I get only the runc runtime — no nvidia runtime, unlike the examples around the internet.
$ docker info|grep -i runtime
Runtimes: runc
Default Runtime: runc
How can I add this nvidia runtime to my Docker installation? Most posts and questions I have found so far say something like "I just forgot to restart my Docker daemon, and then it worked", but that does not help me. What should I do?
I checked many issues on GitHub, as well as the #1, #2 and #3 Stack Overflow questions — they didn't help.
You should already be able to run nvidia-smi on the host and see your GPU's name, driver version, and CUDA version. To use the GPU from Docker, add the NVIDIA Container Toolkit to the host; it integrates with Docker Engine to automatically configure your containers for GPU support.
NVIDIA Container Runtime is a GPU-aware container runtime, compatible with the Open Container Initiative (OCI) specification used by Docker, CRI-O, and other popular container technologies. It simplifies building and deploying containerized GPU-accelerated applications to desktops, the cloud, or data centers.
nvidia-docker is essentially a wrapper around the docker command that transparently provisions a container with the components needed to execute code on the GPU. It is only strictly necessary when using nvidia-docker run to execute a container that uses GPUs.
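As a side note: on newer Docker versions (19.03 and later) with the NVIDIA Container Toolkit installed, the --gpus flag can request GPUs without registering a custom runtime at all. Whether this applies depends on your Docker version, so treat it as an alternative worth checking, not a guaranteed fix:

```shell
# With Docker 19.03+ and the NVIDIA Container Toolkit on the host,
# GPUs can be requested directly via --gpus (no "nvidia" runtime entry needed).
docker run --gpus all --rm nvidia/cuda nvidia-smi

# Request a specific count, or specific devices by index:
docker run --gpus 1 --rm nvidia/cuda nvidia-smi
docker run --gpus '"device=0"' --rm nvidia/cuda nvidia-smi
```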
The nvidia runtime you need is nvidia-container-runtime.
Follow the installation instructions here:
https://github.com/NVIDIA/nvidia-container-runtime#installation
Basically, if it's not already present, you install it with your package manager first:
sudo apt-get install nvidia-container-runtime
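As a quick sanity check after installing, you can verify where the runtime binary landed; that path must match the "path" entry you register in Docker's daemon configuration (the /usr/bin location is the usual one for the apt package, but confirm it on your own system):

```shell
# Locate the installed runtime binary; the printed path must match
# the "path" value registered in /etc/docker/daemon.json.
which nvidia-container-runtime
# Typically prints: /usr/bin/nvidia-container-runtime
```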
Then you add it to docker runtimes:
https://github.com/nvidia/nvidia-container-runtime#daemon-configuration-file
This option worked for me:
$ sudo tee /etc/docker/daemon.json <<EOF
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo pkill -SIGHUP dockerd
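If the SIGHUP reload does not pick up the new configuration (behavior can differ across Docker versions and init systems), fully restarting the daemon is the more heavy-handed but reliable option:

```shell
# Fully restart the Docker daemon so it re-reads /etc/docker/daemon.json.
# Note: this stops running containers unless live-restore is enabled.
sudo systemctl restart docker
```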
Check that it's added:
$ docker info|grep -i runtime
Runtimes: nvidia runc
Default Runtime: runc
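Once the nvidia runtime shows up there, the command from the question should work. A minimal end-to-end check (assuming the nvidia/cuda image's CUDA version is compatible with your host driver):

```shell
# Run nvidia-smi inside a container using the nvidia runtime;
# the output should match what you see on the host.
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
```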