In my Dockerfile I use curl or ADD to download the latest version of an archive, like this:
FROM debian:jessie
...
RUN apt-get install -y curl
...
RUN curl -sL http://example.com/latest/archive.tar.gz --output archive.tar.gz
...
ADD http://example.com/latest/archive2.tar.gz ./
...
The RUN statement that uses curl, or the ADD statement, creates its own image layer. That layer will be used as a cache for future executions of docker build.
Question: How can I disable caching for those instructions?
It would be great to get something like cache invalidation working there, e.g. by using HTTP ETags or by querying the Last-Modified header field. That would make it possible to do a quick check based on the HTTP headers to decide whether a cached layer can be used or not.
I know that some dirty tricks could help, e.g. executing a download shell script in the RUN statement instead. Its filename would be changed by our build system before docker build is triggered, and I could do the HTTP checks inside that script. But then I would need to store either the last used ETag or the Last-Modified value in a file somewhere. I am wondering whether there is a cleaner, native Docker feature that I could use here.
$ docker build --no-cache -t sample-image:sample-tag .
When you execute this command, the daemon will not look for cached builds of existing image layers and will force a clean build of the Docker image from the Dockerfile. If you use Docker Compose, you can use the following command.
About Layer Caching in Docker
Docker uses a layer cache to optimize and speed up the process of building Docker images. Docker Layer Caching mainly works on the RUN, COPY and ADD commands, which will be explained in more detail next.
Docker's build cache is a handy feature. It speeds up Docker builds by reusing previously created layers. You can use the --no-cache option to disable caching, or use a custom Docker build argument to enforce rebuilding from a certain step.
As each instruction is examined, Docker looks for an existing image in its cache that it can reuse, rather than creating a new (duplicate) image. If you do not want to use the cache at all, you can use the --no-cache=true option on the docker build command. However, if you do let Docker use its cache,...
In case a file in the source code changes, the checksum of the copied files changes as well, and therefore, Docker invalidates the build cache. Any subsequent instructions have to be executed again and all npm packages will be re-downloaded. Thus, it is important to identify cacheable units and to split them.
Every Dockerfile instruction creates a new intermediate image, which is stored in the Docker cache. When parsing a Dockerfile, Docker carefully examines each instruction and checks if there is a cached intermediate image for the instruction.
A build-time argument can be used to forcibly break the cache from a given step onwards. For example, in your Dockerfile, put
ARG CACHE_DATE=not_a_date
and then give this argument a fresh value on every new build. A timestamp is the obvious choice:
docker build --build-arg CACHE_DATE=$(date +%Y-%m-%d:%H:%M:%S) ...
Make sure the value is a string without any spaces (or quote the whole argument); otherwise the shell splits it and the docker client sees multiple arguments.
See the detailed discussion in Issue 22832.
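The build-argument trick can be combined with the HTTP header check from the question, so that the cache is only broken when the remote file actually changes. The following is a minimal sketch, not a built-in Docker feature: cache_key_from_headers is a hypothetical helper name, and the sketch assumes that the server sends an ETag or Last-Modified header and that ARG CACHE_DATE is declared before the download step and referenced by it.

```shell
# Sketch only: cache_key_from_headers is a hypothetical helper, not part of Docker.
# It reads raw HTTP response headers on stdin and prints a numeric, space-free key.
cache_key_from_headers() {
  value=$(tr -d '\r' | awk '
    /^HTTP\// { etag = ""; lm = "" }   # a new response starts (e.g. after a redirect)
    { lower = tolower($0) }
    index(lower, "etag:") == 1          { etag = substr($0, 6) }
    index(lower, "last-modified:") == 1 { lm = substr($0, 15) }
    END {
      v = (etag != "") ? etag : lm     # prefer ETag, fall back to Last-Modified
      sub(/^[ \t]+/, "", v)            # drop whitespace after the colon
      print v
    }')
  [ -n "$value" ] || return 1          # neither header was present
  printf '%s' "$value" | cksum | cut -d' ' -f1
}

# Possible usage (needs network access; falls back to a timestamp):
#   key=$(curl -sIL http://example.com/latest/archive.tar.gz | cache_key_from_headers) \
#     || key=$(date +%s)
#   docker build --build-arg "CACHE_DATE=$key" ...
```

Because the key only changes when the header value changes, repeated builds against an unchanged archive keep passing the same CACHE_DATE and can still hit the cache.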
docker build --no-cache would invalidate the cache for all the commands.
The Dockerfile ADD command used to always invalidate the cache, although this has been improved in recent Docker versions:
Docker is supposed to checksum any file added through ADD and then decide whether it should use the cache or not.
So if the added file has changed, the cache should be invalidated for the ADD command.
Issue 1326 mentions other tips:
This worked.
RUN yum -y install firefox #redo
So it looks like Docker will re-run the step (and all the steps below it) if the string I am passing to the RUN command changes in any way, even if it's just a comment.
The Docker cache is used if, and only if, none of the step's ancestors has changed (this behavior makes sense, as the next command adds its changes on top of the previous layer).
The cache is used only if not a single character has changed (so even a space is enough to invalidate the cache).
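The two rules above (no ancestor may change, and no character of the instruction may change) can be pictured as a chain of keys. The sketch below is a toy model for illustration only: it is not Docker's actual cache-key algorithm, and layer_keys is a hypothetical helper name.

```shell
# Toy model only: this is NOT Docker's actual cache-key algorithm.
# layer_keys is a hypothetical helper that reads one instruction per line on
# stdin and prints one key per instruction. Each key depends on the parent key
# and on the exact text of the instruction.
layer_keys() {
  parent=base
  while IFS= read -r instruction || [ -n "$instruction" ]; do
    parent=$(printf '%s\n%s' "$parent" "$instruction" | cksum | cut -d' ' -f1)
    printf '%s\n' "$parent"
  done
}

# Possible usage: compare the keys of two variants of the same build.
#   printf '%s\n' 'FROM debian:jessie' 'RUN apt-get install -y curl' | layer_keys
#   printf '%s\n' 'FROM debian:jessie' 'RUN apt-get install -y curl #redo' | layer_keys
```

In this model, appending a comment such as #redo to one instruction changes that instruction's key and, through the parent link, the key of every instruction after it, while the keys of earlier instructions stay the same.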