How to install CUDA enabled PyTorch in a Docker container?

Tags: docker, python-3.x, anaconda, pytorch

I am trying to build a Docker container on a server, with a conda environment built inside it. All the other requirements are satisfied except for CUDA-enabled PyTorch (I can get PyTorch working without CUDA, no problem). How do I make sure PyTorch is using CUDA?

This is the Dockerfile:

# Use nvidia/cuda image
FROM nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04

# set bash as current shell
RUN chsh -s /bin/bash

# install anaconda
RUN apt-get update
RUN apt-get install -y wget bzip2 ca-certificates libglib2.0-0 libxext6 libsm6 libxrender1 git mercurial subversion && \
        apt-get clean
RUN wget --quiet https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh -O ~/anaconda.sh && \
        /bin/bash ~/anaconda.sh -b -p /opt/conda && \
        rm ~/anaconda.sh && \
        ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
        echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
        find /opt/conda/ -follow -type f -name '*.a' -delete && \
        find /opt/conda/ -follow -type f -name '*.js.map' -delete && \
        /opt/conda/bin/conda clean -afy

# set path to conda
ENV PATH /opt/conda/bin:$PATH


# setup conda virtual environment
COPY ./requirements.yaml /tmp/requirements.yaml
RUN conda update conda \
    && conda env create --name camera-seg -f /tmp/requirements.yaml \
    && conda install -y -c conda-forge -n camera-seg flake8

# From the pythonspeed tutorial; Make RUN commands use the new environment
SHELL ["conda", "run", "-n", "camera-seg", "/bin/bash", "-c"]

# PyTorch with CUDA 10.2
RUN conda activate camera-seg && conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

RUN echo "conda activate camera-seg" > ~/.bashrc
ENV PATH /opt/conda/envs/camera-seg/bin:$PATH

This gives me the following error when I try to build this container (docker build -t camera-seg .):

.....

Step 10/12 : RUN conda activate camera-seg && conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
 ---> Running in e0dd3e648f7b
ERROR conda.cli.main_run:execute(34): Subprocess for 'conda run ['/bin/bash', '-c', 'conda activate camera-seg && conda install pytorch torchvision cudatoolkit=10.2 -c pytorch']' command failed.  (See above for error)

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.



The command 'conda run -n camera-seg /bin/bash -c conda activate camera-seg && conda install pytorch torchvision cudatoolkit=10.2 -c pytorch' returned a non-zero code: 1

This is the requirements.yaml:

name: camera-seg
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.6
  - numpy
  - pillow
  - yaml
  - pyyaml
  - matplotlib
  - jupyter
  - notebook
  - tensorboardx
  - tensorboard
  - protobuf
  - tqdm

When I put pytorch, torchvision and cudatoolkit=10.2 in requirements.yaml, PyTorch installs successfully but cannot recognize CUDA (torch.cuda.is_available() returns False).
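When torch.cuda.is_available() returns False, it can help to separate two cases: a CPU-only PyTorch build was installed, or a CUDA build was installed but no GPU is visible to the process. The following is a minimal diagnostic sketch, not part of the original question; the function name describe_torch_build is hypothetical, and it only uses torch.__version__, torch.version.cuda, torch.cuda.is_available() and torch.cuda.device_count().

```python
import importlib.util


def describe_torch_build():
    """Summarise whether the installed PyTorch build can see CUDA.

    Hypothetical helper for illustration; it returns a plain dict so the
    result can be printed or logged from inside the container.
    """
    report = {
        "installed": False,
        "torch_version": None,
        "built_for_cuda": None,
        "cuda_available": False,
        "device_count": 0,
    }
    # Avoid an ImportError when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["installed"] = True
    report["torch_version"] = torch.__version__
    # torch.version.cuda is None for a CPU-only build; otherwise it is the
    # CUDA version string the binaries were compiled against.
    report["built_for_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If built_for_cuda is None, the package that was resolved is a CPU-only build, so the problem lies in the install step. If it shows a CUDA version but cuda_available is False, the build is probably fine and the GPU is not being exposed to the process.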

I have tried various solutions, for example this, this and this, and some different combinations of them, but all to no avail.

Any help is much appreciated. Thanks.

asked Dec 29 '20 12:12

Rahul Bohare

1 Answer

I got it working after many, many tries. Posting the answer here in case it helps anyone.

Basically, I installed pytorch and torchvision through pip (from within the conda environment) and the rest of the dependencies through conda as usual.

This is how the final Dockerfile looks:

# Use nvidia/cuda image
FROM nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04

# set bash as current shell
RUN chsh -s /bin/bash
SHELL ["/bin/bash", "-c"]

# install anaconda
RUN apt-get update
RUN apt-get install -y wget bzip2 ca-certificates libglib2.0-0 libxext6 libsm6 libxrender1 git mercurial subversion && \
        apt-get clean
RUN wget --quiet https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh -O ~/anaconda.sh && \
        /bin/bash ~/anaconda.sh -b -p /opt/conda && \
        rm ~/anaconda.sh && \
        ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
        echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
        find /opt/conda/ -follow -type f -name '*.a' -delete && \
        find /opt/conda/ -follow -type f -name '*.js.map' -delete && \
        /opt/conda/bin/conda clean -afy

# set path to conda
ENV PATH /opt/conda/bin:$PATH


# setup conda virtual environment
COPY ./requirements.yaml /tmp/requirements.yaml
RUN conda update conda \
    && conda env create --name camera-seg -f /tmp/requirements.yaml

RUN echo "conda activate camera-seg" >> ~/.bashrc
ENV PATH /opt/conda/envs/camera-seg/bin:$PATH
ENV CONDA_DEFAULT_ENV camera-seg

And this is what the requirements.yaml looks like:

name: camera-seg
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.6
  - pip
  - numpy
  - pillow
  - yaml
  - pyyaml
  - matplotlib
  - jupyter
  - notebook
  - tensorboardx
  - tensorboard
  - protobuf
  - tqdm
  - pip:
    - torch
    - torchvision

Then I build the container using the command docker build -t camera-seg . and PyTorch is now able to recognize CUDA.
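To confirm the result from inside the running container, a small smoke test that actually places a tensor on the GPU is more convincing than the availability flag alone. This is a sketch under assumptions that are not in the original answer: the function name gpu_smoke_test is hypothetical, and it assumes the container was started with GPU access (for example with docker run --gpus all on a host that has the NVIDIA Container Toolkit). GPUs are typically not visible during docker build, so the check belongs at run time.

```python
import importlib.util


def gpu_smoke_test():
    """Run a tiny computation on the GPU and report a status string.

    Hypothetical helper for illustration. Returns one of:
    "torch-missing", "no-cuda", "ok", "unexpected-result".
    """
    if importlib.util.find_spec("torch") is None:
        return "torch-missing"

    import torch

    if not torch.cuda.is_available():
        return "no-cuda"

    device = torch.device("cuda")
    ones = torch.ones(2, 2, device=device)
    # A 2x2 matrix of ones times itself is a 2x2 matrix of twos (sum 8).
    total = (ones @ ones).sum().item()
    return "ok" if total == 8.0 else "unexpected-result"


if __name__ == "__main__":
    print(gpu_smoke_test())
```

A result of "no-cuda" inside the container, when the same image reports a CUDA build, points at how the container was launched and not at the Dockerfile.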

answered Nov 15 '22 23:11

Rahul Bohare
