Dockerfile: Benefits of repeated apt cache cleans

In the quest for ever smaller Docker images, it's common to remove the apt cache (on Debian/Ubuntu-based images) after installing packages, with something like:

RUN rm -rf /var/lib/apt/lists/*

I've seen a few Dockerfiles where this is done after each package installation, i.e. with the pattern:

# Install some package
RUN apt-get update \
    && apt-get install -y <some-package> \
    && rm -rf /var/lib/apt/lists/*

# Do something
...

# Install another package
RUN apt-get update \
    && apt-get install -y <another-package> \
    && rm -rf /var/lib/apt/lists/*

# Do something else
...

Is there any benefit to doing this, rather than cleaning the apt cache only once at the very end (and thus updating it only once at the beginning)? To me it seems that removing and updating the cache multiple times just slows down the image build.

419

asked May 24 '20 18:05

jmd_dk

1 Answer

The main reason people do this is to minimise the amount of data stored in each Docker layer. When pulling a Docker image, you have to pull the entire content of every layer it is made of.

For example, imagine the following two layers in the image:

RUN apt-get update
RUN rm -rf /var/lib/apt/lists/*

The first RUN command results in a layer containing the lists, which will ALWAYS be pulled by anyone using your image. Layers are immutable, so the second command cannot shrink the first layer: it only adds a new layer that hides those files (so they're no longer accessible). Ultimately those extra files are just a waste of space and time.

On the other hand, if you do both within a single layer:

RUN apt-get update && rm -rf /var/lib/apt/lists/*

the lists are deleted before the layer is finished, so they are never pushed or pulled as part of the image.
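Applied to the pattern in the question, this means that when nothing has to happen between the two installs, they can share one RUN with a single update and a single cleanup. The following is only a minimal sketch: the base image `debian:bookworm-slim` and the packages `curl` and `ca-certificates` are placeholders chosen for illustration, not taken from the question.

```dockerfile
# Assumed base image, for illustration only.
FROM debian:bookworm-slim

# One update, one install, one cleanup, all in the same layer.
# The package lists exist while this RUN executes but are removed
# before the layer is committed, so they never reach the image.
# curl and ca-certificates are placeholder packages.
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
    && rm -rf /var/lib/apt/lists/*
```

Splitting this back into separate RUN commands for update, install, and cleanup would produce the same final filesystem, but the lists would stay in the earlier layers.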

So, why have multiple layers which use apt-get install? This is likely so that layers can be reused across images: Docker shares identical layers between images, which saves space on the server and speeds up builds and pulls.
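As a rough sketch of how that reuse can look, assume two images that both need the same common package but differ in one extra package. The base image and the package names below are hypothetical placeholders. If a second Dockerfile begins with exactly the same FROM and first RUN lines and is built on the same builder, the build cache can reuse the first layer instead of rebuilding it.

```dockerfile
# Assumed base image, for illustration only.
FROM debian:bookworm-slim

# Common layer: keep this instruction identical in every Dockerfile
# that should share it. ca-certificates is a placeholder package.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Image-specific layer: only this part differs between images.
# Each install layer still cleans up after itself, so none of the
# layers carries the package lists. curl is a placeholder package.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```

The trade-off is the one raised in the question: each extra install layer pays for its own apt-get update at build time, in exchange for layers that can be cached and shared independently.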

109

answered Oct 22 '22 18:10

Ben XO

Tags: docker, dockerfile, ubuntu, debian, apt
