With Docker, it's very easy to unknowingly bloat your image. Every instruction that changes the file system (RUN, COPY, ADD) creates a new layer, and all layers are saved separately. Therefore, if a big file is generated by one command and removed later in the Dockerfile, it still counts toward the image size.
During the build, if project code is added before dependencies are installed, every layer above the base image is replaced on each rebuild, and the resulting images are 'heavy'. Instead, layers containing dependencies should be added before the layers with project code, since the latter change much more often.
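A minimal sketch of that ordering, assuming a Node.js project with a package.json, a package-lock.json and an index.js entry point (all hypothetical names, as is the node:20-alpine base tag):

```dockerfile
FROM node:20-alpine
WORKDIR /app

# Dependency manifests change rarely: copy them first so the
# install layer below can be reused from cache on most rebuilds.
COPY package.json package-lock.json ./
RUN npm ci

# Project code changes often: only the layers from here down
# are rebuilt when a source file is edited.
COPY . .
CMD ["node", "index.js"]
```

With this layout, editing a source file should invalidate only the final COPY layer, while the dependency layer stays cached until one of the manifests changes.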
The average size of our Docker images was ~300MB - ~600MB. However, my new company is using Docker mostly for development workflow, and the average image size is ~1.5GB - ~3GB. Some of the larger images (10GB+) are being actively refactored to reduce the image size.
As @rexposadas said, images include all the layers and each layer includes all the dependencies for what you installed. It is also important to note that the base images (like fedora:latest) tend to be very bare-bones. You may be surprised by the number of dependencies your installed software has.
I was able to make your installation significantly smaller by adding yum -y clean all to each line:
FROM fedora:latest
RUN yum -y install nano && yum -y clean all
RUN yum -y install git && yum -y clean all
It is important to do that for each RUN, before the layer gets committed, or else deletes don't actually remove data. That is, in a union/copy-on-write file system, cleaning at the end doesn't really reduce file system usage because the real data is already committed to lower layers. To get around this you must clean at each layer.
$ docker history bf5260c6651d
IMAGE          CREATED         CREATED BY                                      SIZE
bf5260c6651d   4 days ago      /bin/sh -c yum -y install git; yum -y clean a   260.7 MB
172743bd5d60   4 days ago      /bin/sh -c yum -y install nano; yum -y clean    12.39 MB
3f2fed40e4b0   2 weeks ago     /bin/sh -c #(nop) ADD file:cee1a4fcfcd00d18da   372.7 MB
fd241224e9cf   2 weeks ago     /bin/sh -c #(nop) MAINTAINER Lokesh Mandvekar   0 B
511136ea3c5a   12 months ago                                                   0 B
Docker images are not inherently large; you are just building large images.
The scratch image is 0B and you can use that to package up your code if you can compile your code into a static binary. For example, you can compile your Go program and package it on top of scratch to make a fully usable image that is less than 5MB.
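As a rough sketch of what that could look like, assuming a statically linked binary named app (a hypothetical name) has already been built next to the Dockerfile:

```dockerfile
# Assumes a static binary named "app" (hypothetical) was built beforehand, e.g.
#   CGO_ENABLED=0 GOOS=linux go build -o app .
FROM scratch

# The binary is the only file in the image.
COPY app /app

# scratch has no shell, so the exec (JSON array) form is required.
ENTRYPOINT ["/app"]
```

Keep in mind that scratch contains nothing at all: if the program needs CA certificates, timezone data or similar files, those have to be copied in as well.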
The key is to not use the official Docker images; they are too big. Scratch isn't all that practical either, so I'd recommend using Alpine Linux as your base image. It is ~5MB, and you then add only what is required for your app. This post about Microcontainers shows you how to build very small images based on Alpine.
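Something along these lines, as a sketch only; the alpine:3.19 tag, the Node.js app and the index.js entry point are assumptions for illustration, not taken from the post mentioned above:

```dockerfile
FROM alpine:3.19

# --no-cache fetches the package index on the fly, so no apk cache
# is left behind in the layer.
RUN apk add --no-cache nodejs npm

WORKDIR /app
COPY . .
CMD ["node", "index.js"]
```

Only the runtime the app actually needs is installed on top of the small base image.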
UPDATE: many official Docker images now have Alpine-based variants, so those are good to use.
Here are some more things you can do:
Avoid multiple RUN commands where you can. Put as much as possible into one RUN command (using &&).
With these, AND the recommendations from @Andy and @michau, I was able to shrink my nodejs image from 1.062 GB to 542 MB.
Edit:
One more important thing:
"It took me a while to really understand that each Dockerfile command creates a new container with the deltas. [...] It doesn't matter if you rm -rf the files in a later command; they continue exist in some intermediate layer container."
So now I managed to put apt-get install, wget, npm install (with git dependencies) and apt-get remove into a single RUN command, so now my image is only 438 MB.
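A sketch of what such a combined RUN might look like. The node:20-slim base tag, the download URL and the file names are placeholders I made up, not the actual Dockerfile from this answer:

```dockerfile
FROM node:20-slim
WORKDIR /app
COPY package.json ./

# One RUN: tools needed only for downloading and building are installed,
# used and removed before the layer is committed.
# The URL and /tmp/asset.tar.gz are placeholders.
RUN apt-get update && \
    apt-get install -y --no-install-recommends ca-certificates wget git && \
    wget -O /tmp/asset.tar.gz https://example.com/asset.tar.gz && \
    npm install && \
    apt-get purge -y wget git && \
    apt-get autoremove -y && \
    rm -rf /var/lib/apt/lists/* /tmp/asset.tar.gz
```

Because the install and the removal happen in the same layer, wget, git and the apt package lists never end up in the final image.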
Edit 29/06/17
With Docker v17.06 comes a new feature for Dockerfiles:
You can have multiple FROM statements inside one Dockerfile, and only the stuff from the last FROM will be in your final Docker image. This is useful to reduce image size, for example:
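# Stage 1: build with the full toolchain.
# "nodejs" stands in for a Node.js base image (the official one is named "node").
FROM nodejs AS builder
WORKDIR /var/my-project
# -y keeps apt-get non-interactive; package names and the repository are placeholders.
RUN apt-get update && \
    apt-get install -y ruby python git openssh gcc && \
    git clone my-project . && \
    npm install

# Stage 2: only this stage ends up in the final image.
FROM nodejs
COPY --from=builder /var/my-project /var/my-project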
This results in an image containing only the nodejs base image plus the content of /var/my-project from the first stage, but without ruby, python, git, openssh and gcc!
Yes, those sizes are ridiculous, and I really have no idea why so few people notice that.
I made an Ubuntu image that is actually minimal (unlike other so-called "minimal" images). It's called textlab/ubuntu-essential and is 60 MB.
FROM textlab/ubuntu-essential
RUN apt-get update && apt-get -y install nano
The above image is 82 MB after installing nano.
FROM textlab/ubuntu-essential
RUN apt-get update && apt-get -y install nano git
Git has many more prerequisites, so the image gets larger, about 192 MB. That's still less than the initial size of most images.
You can also take a look at the script I wrote to make the minimal Ubuntu image for Docker. You can perhaps adapt it to Fedora, but I'm not sure how much you will be able to uninstall.
The following helped me a lot:
After removing unused packages (e.g. redis, 1200 MB freed) inside my container, I have done the following:
The layers get flattened. The size of the new image will be smaller because I've removed packages from the container as stated above.
It took me a long time to understand this, which is why I've added my comment.
As a best practice, you should use a single RUN command where possible, because every RUN instruction in the Dockerfile writes a new layer in the image and every layer requires extra space on disk. To keep the number of layers to a minimum, any file manipulation (installing, moving, extracting, removing, etc.) should ideally be done in a single RUN instruction:
FROM fedora:latest
RUN yum -y install nano git && yum -y clean all