
How can I prevent a Dockerfile instruction from being cached?

In my Dockerfile I use curl or ADD to download the latest version of an archive like:

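FROM debian:jessie
...
RUN apt-get update && apt-get install -y curl
...
RUN curl -sL http://example.com/latest/archive.tar.gz --output archive.tar.gz
...
# ADD requires a destination; the path below is only illustrative
ADD http://example.com/latest/archive2.tar.gz /archive2.tar.gz
...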

The RUN instruction that calls curl and the ADD instruction each create their own image layer. Those layers are then reused as a cache in later runs of docker build.

Question: How can I disable caching for those instructions?

It would be great to get something like cache invalidation working there, e.g. by using HTTP ETags or by querying the Last-Modified header field. That would allow a quick check based on the HTTP headers to decide whether a cached layer can be used or not.

I know that some dirty tricks could help, e.g. executing a download shell script in the RUN instruction instead. Our build system would change its filename before docker build is triggered, and I could do the HTTP checks inside that script. But then I would need to store either the last used ETag or the last Last-Modified value in a file somewhere. I am wondering whether there is some cleaner, native Docker functionality that I could use here.

Henrik Sachse asked Aug 03 '15 08:08



2 Answers

A build-time argument can be specified to forcibly break the cache from that step onwards. For example, in your Dockerfile, put

ARG CACHE_DATE=not_a_date

and then give this argument a fresh value on every new build. A timestamp is the obvious choice.

docker build --build-arg CACHE_DATE=$(date +%Y-%m-%d:%H:%M:%S) ...

Make sure the value contains no spaces; otherwise the shell splits it into several arguments and docker build does not receive it as a single value.
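Where the ARG line sits matters. According to the Dockerfile reference, a changed ARG value causes a cache miss at its first use rather than at its definition, and RUN instructions that follow the ARG use it implicitly as an environment variable. The following is a minimal sketch, not a definitive layout: the helper name write_dockerfile is hypothetical, and the Dockerfile contents reuse the instructions from the question. It declares the argument directly before the download step so that the earlier steps stay cached.

```shell
#!/bin/sh
# write_dockerfile PATH (hypothetical helper): write an example Dockerfile
# in which the cache-busting ARG sits directly before the download step.
write_dockerfile() {
    cat > "$1" <<'EOF'
FROM debian:jessie
# Steps above the ARG are unaffected and stay cached between builds.
RUN apt-get update && apt-get install -y curl
# Declared right before the step that should be re-run.
ARG CACHE_DATE=not_a_date
# Referencing the argument makes the dependency on it explicit.
RUN echo "cache key: $CACHE_DATE" \
 && curl -sL http://example.com/latest/archive.tar.gz --output archive.tar.gz
EOF
}

# Usage (not executed here):
#   write_dockerfile ./Dockerfile
#   docker build --build-arg CACHE_DATE="$(date +%Y-%m-%d:%H:%M:%S)" .
```

Quoting the command substitution, as in the usage comment, also guards against the word-splitting problem mentioned above.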

See a detailed discussion on Issue 22832.
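A timestamp forces a rebuild on every run, even when the archive has not changed. As a rough sketch of the HTTP-header idea from the question, the build argument could instead be derived from the ETag or Last-Modified header, so the cached layer is reused until the remote file changes. The function names header_value and cache_key are hypothetical, and the sketch assumes the server answers HEAD requests with at least one of those headers; if it sends neither, it falls back to a timestamp.

```shell
#!/bin/sh
# header_value NAME (hypothetical helper): read raw HTTP headers on stdin
# and print the value of header NAME (case-insensitive). If the header
# occurs several times (e.g. across redirects), the last one wins.
header_value() {
    tr -d '\r' | awk -v name="$1" '
        BEGIN { name = tolower(name) }
        {
            i = index($0, ":")
            if (i > 0 && tolower(substr($0, 1, i - 1)) == name) {
                v = substr($0, i + 1)
                sub(/^[ \t]+/, "", v)
                sub(/[ \t]+$/, "", v)
                found = v
            }
        }
        END { if (found != "") print found }'
}

# cache_key URL (hypothetical helper): print a value that changes only
# when the remote file changes. Prefers ETag, then Last-Modified, and
# falls back to the current time so that a rebuild is forced.
cache_key() {
    headers=$(curl -sIL "$1") || return 1
    key=$(printf '%s\n' "$headers" | header_value etag)
    [ -n "$key" ] || key=$(printf '%s\n' "$headers" | header_value last-modified)
    [ -n "$key" ] || key=$(date +%s)
    # Drop quotes and replace spaces so the value is a single plain word.
    printf '%s\n' "$key" | tr -d '"' | tr ' ' '_'
}

# Usage (not executed here):
#   docker build \
#     --build-arg CACHE_DATE="$(cache_key http://example.com/latest/archive.tar.gz)" .
```

Nothing has to be stored between builds with this approach: the previous header value is effectively remembered by the build cache itself, because an unchanged argument value yields a cache hit.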

Ruifeng Ma answered Oct 10 '22 19:10


docker build --no-cache would invalidate the cache for all the commands.

The Dockerfile ADD instruction used to always invalidate the cache, but this has been improved in recent Docker versions:

Docker is supposed to checksum any file added through ADD and then decide if it should use the cache or not.

So if the file added has changed, the cache should be invalidated for the ADD command.


Issue 1326 mentions other tips:

This worked.

RUN yum -y install firefox #redo

So it looks like Docker will re-run the step (and all the steps below it) if the string I am passing to the RUN command changes in any way, even if it's just a comment.

The Docker cache is used if, and only if, none of the step's ancestors has changed (this behavior makes sense, as each command adds its changes on top of the previous layer).

The cache is used only if not a single character has changed (so even a space is enough to invalidate the cache).
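The comment trick quoted above can be automated by rewriting the marker before each build. This is a minimal sketch under stated assumptions: bump_marker is a hypothetical helper, the #redo marker is the one from the quoted issue, and the stamp must not contain characters that are special to sed (such as / or &).

```shell
#!/bin/sh
# bump_marker FILE STAMP (hypothetical helper): replace everything from
# "#redo" to the end of the line with "#redo-STAMP". Changing the text of
# the RUN instruction is what makes Docker treat it as a new step.
bump_marker() {
    file=$1
    stamp=$2
    sed "s/#redo.*\$/#redo-$stamp/" "$file" > "$file.tmp" \
        && mv "$file.tmp" "$file"
}

# Usage (not executed here):
#   bump_marker ./Dockerfile "$(date +%s)"
#   docker build .
```

Unlike the build argument, this modifies the Dockerfile itself, so it suits build systems that generate or patch the file anyway.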

VonC answered Oct 10 '22 18:10