My Dockerfile is something like
FROM my/base
ADD . /srv
RUN pip install -r requirements.txt
RUN python setup.py install
ENTRYPOINT ["run_server"]
Every time I build a new image, dependencies have to be reinstalled, which could be very slow in my region.
One way I think of to cache
packages that have been installed is to override the my/base
image with newer images like this:
docker build -t new_image_1 .
docker tag new_image_1 my/base
So next time I build with this Dockerfile, my/base already has some packages installed.
But this solution has two problems:
So what better solution could I use to solve this problem?
Some information about the docker on my machine:
☁ test docker version
Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.1
Git commit (client): d84a070
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.1
Git commit (server): d84a070
☁ test docker info
Containers: 0
Images: 56
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Dirs: 56
Execution Driver: native-0.2
Kernel Version: 3.13.0-29-generic
WARNING: No swap limit support
Try to use docker as environment foundation for applications in host machine is not a good choice, docker is designed to run applications environment-independent. Applications should be placed into docker image via docker build and run in docker container in runtime.
To install packages in a docker container, the packages should be defined in the Dockerfile. If you want to install packages in the Container, use the RUN statement followed by exact download command . You can update the Dockerfile with latest list of packages at anytime and build again to create new image out of it.
You don't need to rebuild your Docker image in development for each tiny code change. If you mount your code into your dev container, you don't have to build a new image on every code change and iterate faster. It's a great feeling when you make changes and see the results right away!
COPY is a docker file command that copies files from a local source location to a destination in the Docker container. ADD command is used to copy files/directories into a Docker image. It only has only one assigned function. It can also copy files from a URL.
Try to build a Dockerfile which looks something like this:
FROM my/base
WORKDIR /srv
ADD ./requirements.txt /srv/requirements.txt
RUN pip install -r requirements.txt
ADD . /srv
RUN python setup.py install
ENTRYPOINT ["run_server"]
Docker will use cache during pip install as long as you do not make any changes to the requirements.txt
, irrespective of the fact whether other code files at .
were changed or not. Here's an example.
Here's a simple Hello, World!
program:
$ tree
.
├── Dockerfile
├── requirements.txt
└── run.py
0 directories, 3 file
# Dockerfile
FROM dockerfile/python
WORKDIR /srv
ADD ./requirements.txt /srv/requirements.txt
RUN pip install -r requirements.txt
ADD . /srv
CMD python /srv/run.py
# requirements.txt
pytest==2.3.4
# run.py
print("Hello, World")
The output of docker build:
Step 1 : WORKDIR /srv
---> Running in 22d725d22e10
---> 55768a00fd94
Removing intermediate container 22d725d22e10
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> 968a7c3a4483
Removing intermediate container 5f4e01f290fd
Step 3 : RUN pip install -r requirements.txt
---> Running in 08188205e92b
Downloading/unpacking pytest==2.3.4 (from -r requirements.txt (line 1))
Running setup.py (path:/tmp/pip_build_root/pytest/setup.py) egg_info for package pytest
....
Cleaning up...
---> bf5c154b87c9
Removing intermediate container 08188205e92b
Step 4 : ADD . /srv
---> 3002a3a67e72
Removing intermediate container 83defd1851d0
Step 5 : CMD python /srv/run.py
---> Running in 11e69b887341
---> 5c0e7e3726d6
Removing intermediate container 11e69b887341
Successfully built 5c0e7e3726d6
Let's modify run.py
:
# run.py
print("Hello, Python")
Try to build again, below is the output:
Sending build context to Docker daemon 5.12 kB
Sending build context to Docker daemon
Step 0 : FROM dockerfile/python
---> f86d6993fc7b
Step 1 : WORKDIR /srv
---> Using cache
---> 55768a00fd94
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> Using cache
---> 968a7c3a4483
Step 3 : RUN pip install -r requirements.txt
---> Using cache
---> bf5c154b87c9
Step 4 : ADD . /srv
---> 9cc7508034d6
Removing intermediate container 0d7cf71eb05e
Step 5 : CMD python /srv/run.py
---> Running in f25c21135010
---> 4ffab7bc66c7
Removing intermediate container f25c21135010
Successfully built 4ffab7bc66c7
As you can see above, this time docker uses cache during the build. Now, let's update requirements.txt
:
# requirements.txt
pytest==2.3.4
ipython
Below is the output of docker build:
Sending build context to Docker daemon 5.12 kB
Sending build context to Docker daemon
Step 0 : FROM dockerfile/python
---> f86d6993fc7b
Step 1 : WORKDIR /srv
---> Using cache
---> 55768a00fd94
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> b6c19f0643b5
Removing intermediate container a4d9cb37dff0
Step 3 : RUN pip install -r requirements.txt
---> Running in 4b7a85a64c33
Downloading/unpacking pytest==2.3.4 (from -r requirements.txt (line 1))
Running setup.py (path:/tmp/pip_build_root/pytest/setup.py) egg_info for package pytest
Downloading/unpacking ipython (from -r requirements.txt (line 2))
Downloading/unpacking py>=1.4.12 (from pytest==2.3.4->-r requirements.txt (line 1))
Running setup.py (path:/tmp/pip_build_root/py/setup.py) egg_info for package py
Installing collected packages: pytest, ipython, py
Running setup.py install for pytest
Installing py.test script to /usr/local/bin
Installing py.test-2.7 script to /usr/local/bin
Running setup.py install for py
Successfully installed pytest ipython py
Cleaning up...
---> 23a1af3df8ed
Removing intermediate container 4b7a85a64c33
Step 4 : ADD . /srv
---> d8ae270eca35
Removing intermediate container 7f003ebc3179
Step 5 : CMD python /srv/run.py
---> Running in 510359cf9e12
---> e42fc9121a77
Removing intermediate container 510359cf9e12
Successfully built e42fc9121a77
Notice how docker didn't use cache during pip install. If it doesn't work, check your docker version.
Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.1
Git commit (client): d84a070
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.1
Git commit (server): d84a070
I understand this question has some popular answers already. But there is a newer way to cache files for package managers. I think it could be a good answer in the future when BuildKit becomes more standard.
As of Docker 18.09 there is experimental support for BuildKit. BuildKit adds support for some new features in the Dockerfile including experimental support for mounting external volumes into RUN
steps. This allows us to create caches for things like $HOME/.cache/pip/
.
We'll use the following requirements.txt
file as an example:
Click==7.0
Django==2.2.3
django-appconf==1.0.3
django-compressor==2.3
django-debug-toolbar==2.0
django-filter==2.2.0
django-reversion==3.0.4
django-rq==2.1.0
pytz==2019.1
rcssmin==1.0.6
redis==3.3.4
rjsmin==1.1.0
rq==1.1.0
six==1.12.0
sqlparse==0.3.0
A typical example Python Dockerfile
might look like:
FROM python:3.7
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN pip install -r requirements.txt
COPY . /usr/src/app
With BuildKit enabled using the DOCKER_BUILDKIT
environment variable we can build the uncached pip
step in about 65 seconds:
$ export DOCKER_BUILDKIT=1
$ docker build -t test .
[+] Building 65.6s (10/10) FINISHED
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 120B 0.0s
=> [internal] load metadata for docker.io/library/python:3.7 0.5s
=> CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092 0.0s
=> [internal] load build context 0.6s
=> => transferring context: 899.99kB 0.6s
=> CACHED [internal] helper image for file operations 0.0s
=> [2/4] COPY requirements.txt /usr/src/app/ 0.5s
=> [3/4] RUN pip install -r requirements.txt 61.3s
=> [4/4] COPY . /usr/src/app 1.3s
=> exporting to image 1.2s
=> => exporting layers 1.2s
=> => writing image sha256:d66a2720e81530029bf1c2cb98fb3aee0cffc2f4ea2aa2a0760a30fb718d7f83 0.0s
=> => naming to docker.io/library/test 0.0s
Now, let us add the experimental header and modify the RUN
step to cache the Python packages:
# syntax=docker/dockerfile:experimental
FROM python:3.7
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
COPY . /usr/src/app
Go ahead and do another build now. It should take the same amount of time. But this time it is caching the Python packages in our new cache mount:
$ docker build -t pythontest .
[+] Building 60.3s (14/14) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 120B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> resolve image config for docker.io/docker/dockerfile:experimental 0.5s
=> CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:9022e911101f01b2854c7a4b2c77f524b998891941da55208e71c0335e6e82c3 0.0s
=> [internal] load .dockerignore 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 120B 0.0s
=> [internal] load metadata for docker.io/library/python:3.7 0.5s
=> CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092 0.0s
=> [internal] load build context 0.7s
=> => transferring context: 899.99kB 0.6s
=> CACHED [internal] helper image for file operations 0.0s
=> [2/4] COPY requirements.txt /usr/src/app/ 0.6s
=> [3/4] RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt 53.3s
=> [4/4] COPY . /usr/src/app 2.6s
=> exporting to image 1.2s
=> => exporting layers 1.2s
=> => writing image sha256:0b035548712c1c9e1c80d4a86169c5c1f9e94437e124ea09e90aea82f45c2afc 0.0s
=> => naming to docker.io/library/test 0.0s
About 60 seconds. Similar to our first build.
Make a small change to the requirements.txt
(such as adding a new line between two packages) to force a cache invalidation and run again:
$ docker build -t pythontest .
[+] Building 15.9s (14/14) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 120B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> resolve image config for docker.io/docker/dockerfile:experimental 1.1s
=> CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:9022e911101f01b2854c7a4b2c77f524b998891941da55208e71c0335e6e82c3 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 120B 0.0s
=> [internal] load .dockerignore 0.0s
=> [internal] load metadata for docker.io/library/python:3.7 0.5s
=> CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092 0.0s
=> CACHED [internal] helper image for file operations 0.0s
=> [internal] load build context 0.7s
=> => transferring context: 899.99kB 0.7s
=> [2/4] COPY requirements.txt /usr/src/app/ 0.6s
=> [3/4] RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt 8.8s
=> [4/4] COPY . /usr/src/app 2.1s
=> exporting to image 1.1s
=> => exporting layers 1.1s
=> => writing image sha256:fc84cd45482a70e8de48bfd6489e5421532c2dd02aaa3e1e49a290a3dfb9df7c 0.0s
=> => naming to docker.io/library/test 0.0s
Only about 16 seconds!
We are getting this speedup because we are no longer downloading all the Python packages. They were cached by the package manager (pip
in this case) and stored in a cache volume mount. The volume mount is provided to the run step so that pip
can reuse our already downloaded packages. This happens outside any Docker layer caching.
The gains should be much better on larger requirements.txt
.
Notes:
docker system prune -a
.Hopefully, these features will make it into Docker for building and BuildKit will become the default. If / when that happens I will try to update this answer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With