This is a very similar question to: Docker build: use http cache
I would like to set up a Docker container with a custom conda environment. The corresponding Dockerfile is:
FROM continuumio/miniconda3
WORKDIR /app
COPY . /app
RUN conda update conda
RUN conda env create -f environment.yml
RUN echo "source activate my_env" > ~/.bashrc
ENV PATH /opt/conda/envs/my_env/bin:$PATH
My environment is rather large, a minimal version could look like this:
name: my_env
channels:
- defaults
dependencies:
- python=3.6.8=h0371630_0
prefix: /opt/conda
Every time I make changes to the dependencies, I have to rebuild the image, and that means re-downloading all the packages. Is it possible to set up a cache somehow? Interfacing the containerized conda with a cache outside the container probably defeats the point of containerizing it in the first place. But maybe this is still possible somehow?
With Docker BuildKit there is now a feature for exactly this, called cache mounts. For the precise syntax, see the Dockerfile reference documentation. To use this feature, change:
RUN conda env create -f environment.yml
to
RUN --mount=type=cache,target=/opt/conda/pkgs conda env create -f environment.yml
and make sure that BuildKit is enabled (e.g. via export DOCKER_BUILDKIT=1). The cache will persist between runs and will be shared between concurrent builds.
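Putting it together, the question's Dockerfile could look like this with a cache mount. This is a sketch: the syntax directive on the first line assumes a BuildKit-enabled Docker, and copying environment.yml before the rest of the source means code changes no longer invalidate the conda layer at all:

```dockerfile
# syntax=docker/dockerfile:1
FROM continuumio/miniconda3
WORKDIR /app
# Copy only the environment file first, so the expensive conda layer
# is rebuilt only when the dependencies actually change
COPY environment.yml /app
RUN conda update conda
# Mount a persistent cache over conda's package directory, so even a
# rebuilt layer reuses previously downloaded packages
RUN --mount=type=cache,target=/opt/conda/pkgs conda env create -f environment.yml
RUN echo "source activate my_env" > ~/.bashrc
ENV PATH /opt/conda/envs/my_env/bin:$PATH
# Now copy the rest of the application code
COPY . /app
```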
This is a very indirect answer to the question, but it works like a charm for me.
Out of the many dependencies, there is a large subset which never changes. I always need python 3.6, numpy, pandas, torch, ...
So, instead of caching conda packages, you can rely on Docker's layer caching and reuse a base image with those dependencies already installed:
FROM continuumio/miniconda3
WORKDIR /app
COPY environment.yml /app
# install package dependencies
RUN conda update conda
RUN conda env create -f environment.yml
RUN echo "source activate api_neural" > ~/.bashrc
ENV PATH /opt/conda/envs/api_neural/bin:$PATH
Then you can add additional config on top of this, in a second Dockerfile:
FROM base_deps
# add additional things on top, here I'm running some python in the conda env
RUN /bin/bash -c 'echo $(which python);\
source activate api_neural;\
python -c "import nltk; nltk.download(\"wordnet\"); nltk.download(\"words\")";\
python -m spacy download en;\
python -c "from fastai import untar_data, URLs; model_path = untar_data(URLs.WT103, data=False)"'