
Is it possible to build an `nvidia/cuda`-based image on a server without a GPU?

I have a Dockerfile based on nvidia/cuda like so:

FROM nvidia/cuda:11.0-base

...

I want to build this Dockerfile on our CI server, which does not have an NVIDIA GPU. When I try, I get this error:

------
 > [1/6] FROM docker.io/nvidia/cuda:11.0-base:
------
failed to solve with frontend dockerfile.v0: failed to solve with frontend gateway.v0: rpc error: code = Unknown desc = failed to build LLB: failed to load cache key: docker.io/nvidia/cuda:11.0-base not found

The error says that the image is not found, but I think this is a bit misleading. I've been able to isolate the problem to whether or not a GPU is present.

When I build this Dockerfile on a server with an NVIDIA GPU, I don't get this error. Is it possible to build a Dockerfile based on an nvidia/cuda image on a server without a GPU? This would save costs on our CI server.

I plan to deploy the resulting Docker container on a server that does have a GPU. In other words, is it possible to defer the requirement for a GPU from build time to run time?
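To illustrate what deferring the GPU requirement to run time could look like, the container's entry point might probe for a GPU when it starts rather than assuming one at build time. This is only a minimal sketch in Python, assuming the `nvidia-smi` tool is exposed inside the container at run time; the function name `gpu_available` is hypothetical.

```python
import shutil
import subprocess


def gpu_available(tool="nvidia-smi"):
    """Return True if `tool` is on PATH and `tool -L` exits with status 0.

    Assumption: the NVIDIA runtime makes `nvidia-smi` visible inside the
    container on GPU hosts; on a GPU-less CI server it is simply absent.
    """
    path = shutil.which(tool)
    if path is None:
        # Tool not installed or not mounted: treat as "no GPU".
        return False
    try:
        # `nvidia-smi -L` lists the GPUs visible to the driver.
        result = subprocess.run(
            [path, "-L"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("GPU detected" if gpu_available() else "No GPU detected")
```

With a check like this, nothing in the image build itself would need to touch the GPU; only the running container does.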

asked Aug 07 '20 21:08 by Mario Ishac




1 Answer

It sounds like you may need to load the NVIDIA components, possibly including the proprietary blobs and kernel modules. If the modules are not present, that could explain the build error (missing dependencies).

However, according to https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html, the drivers appear to look for the hardware when they load, which is probably why they are unavailable when you attempt the build.
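One way to test the guess about missing kernel modules is to check on the host whether the NVIDIA driver is loaded at all. A minimal sketch in Python, assuming a Linux host where a loaded NVIDIA kernel module exposes `/proc/driver/nvidia/version`; the function name `nvidia_driver_loaded` is hypothetical.

```python
from pathlib import Path


def nvidia_driver_loaded(proc_file="/proc/driver/nvidia/version"):
    """Return True if the given procfs entry exists as a file.

    Assumption: on Linux, this entry is present only while the NVIDIA
    kernel module is loaded, so its absence suggests no driver is active.
    """
    return Path(proc_file).is_file()


if __name__ == "__main__":
    if nvidia_driver_loaded():
        print("NVIDIA kernel module appears to be loaded")
    else:
        print("NVIDIA kernel module does not appear to be loaded")
```

Running this on both the CI server and the GPU server would show whether the two machines actually differ in the way described here.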

answered Oct 04 '22 00:10 by Hmbl Stdnt