
How do I reduce a python (docker) image size using a multi-stage build?

I am looking for a way to create a multi-stage build for Python in a Dockerfile.

For example, using the following images:

1st image: install all compile-time requirements, and install all needed python modules

2nd image: copy all compiled/built packages from the first image to the second, without the compilers themselves (gcc, postgres-dev, python-dev, etc.)

The final objective is to have a smaller image, running python and the python packages that I need.

In short: how can I 'wrap' all the compiled modules (site-packages / external libs) that were created in the first image, and copy them in a 'clean' manner to the 2nd image?

gCoh asked Jan 31 '18 13:01



2 Answers

OK, so my solution uses wheels. This lets us compile in the first image, build wheel files for all dependencies, and install them in the second image without installing the compilers.

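    # Build stage: the compilers and headers live only here
    FROM python:2.7-alpine as base

    RUN mkdir /svc
    COPY . /svc
    WORKDIR /svc

    RUN apk add --update \
        postgresql-dev \
        gcc \
        musl-dev \
        linux-headers

    # Build wheels for the project and all of its dependencies
    RUN pip install wheel && pip wheel . --wheel-dir=/svc/wheels

    # Final stage: no compilers, install only from the prebuilt wheels
    FROM python:2.7-alpine

    COPY --from=base /svc /svc

    WORKDIR /svc

    RUN pip install --no-index --find-links=/svc/wheels -r requirements.txt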

I describe this approach in more detail in the following blog post:

https://www.blogfoobar.com/post/2018/02/10/python-and-docker-multistage-build

gCoh answered Sep 19 '22 14:09


I recommend the approach detailed in this article (section 2). The author uses virtualenv so that pip install stores all the Python code, binaries, etc. under one folder instead of spreading them across the file system. It is then easy to copy just that one folder to the final "production" image. In summary:

Compile image

  • Create a virtualenv at a path of your choosing.
  • Prepend that path's bin directory to PATH with Docker's ENV instruction. This is all virtualenv needs to function for all subsequent RUN and CMD instructions.
  • Install the system dev packages and run pip install xyz as usual.

Production image

  • Copy the virtualenv folder from the compile image.
  • Prepend the virtualenv folder's bin directory to PATH with ENV, as in the compile image.
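
The steps above could be sketched roughly as follows. This is a minimal illustration rather than the article's exact Dockerfile: the Python 3 base image, the /opt/venv path, the libpq runtime package, the /svc directory, and main.py are all assumptions chosen for the example.

```dockerfile
# Compile stage: build tools are installed only here.
FROM python:3-alpine AS compile-image

RUN apk add --update gcc musl-dev linux-headers postgresql-dev

# Create the virtualenv at a fixed path (hypothetical choice: /opt/venv).
RUN python -m venv /opt/venv
# Putting its bin directory first on PATH makes later RUN/CMD steps use it.
ENV PATH="/opt/venv/bin:$PATH"

COPY requirements.txt .
RUN pip install -r requirements.txt

# Production stage: same base image, no compilers.
FROM python:3-alpine

# Assumption: runtime shared libraries must still be installed here,
# for example libpq if psycopg2 is among the requirements.
RUN apk add --no-cache libpq

# Copy the whole virtualenv to the same path it was created at.
COPY --from=compile-image /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Hypothetical application layout and entry point.
COPY . /svc
WORKDIR /svc
CMD ["python", "main.py"]
```

Both stages use the same base image and the same virtualenv path because a virtualenv records the location of its interpreter, so it is generally not safe to move it to a different path or a different Python build.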
mpoisot answered Sep 18 '22 14:09