
How can I see which file(s) caused a Dockerfile `COPY` statement to invalidate the cache?

`docker build .` rebuilds the Docker image from the Dockerfile in the current directory, ignoring any paths matched by the `.dockerignore` file.

Any `COPY` statement in that Dockerfile will invalidate the build cache if the files on disk differ from the last build.

I've noticed that if you don't ignore the `.git` dir, simple things like `git fetch`, which has no side effects on the working tree, will invalidate the build cache (presumably because some tracking information within the `.git` dir has changed).

It would be very helpful to see precisely which files caused the cache to be invalidated, but I've been unable to find a way.

asked Aug 30 '16 at 02:08 by Dean Rather




1 Answer

I don't think there is a way to see which file invalidated the cache with the current Docker image design.

Since v1.10, layers and images are "content addressable": their IDs are based on a SHA256 checksum that reflects their content.

The caching code just looks up the ID of the image/layer, which will only exist in Docker Engine if the contents of the entire layer match (or, in principle, if there is a hash collision).

So when you run `docker build`, a new build context is created for each command in the Dockerfile, and a checksum is calculated for the entire layer that command would produce. Docker then checks whether an existing layer with that checksum and run config is available.
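To illustrate why a lookup by checksum cannot name the offending file, here is a toy sketch in Python. It is not Docker's actual algorithm: `layer_digest`, the file names, and the `cache` dict are all made up for illustration. It only shows that a single digest over many files yields a plain hit or miss.

```python
import hashlib


def layer_digest(files: dict[str, bytes]) -> str:
    """Toy stand-in for a content-addressed layer ID (hypothetical, not Docker's real scheme)."""
    h = hashlib.sha256()
    for path in sorted(files):
        # Mix in the path and a hash of the contents, in a stable order.
        h.update(path.encode("utf-8"))
        h.update(b"\0")
        h.update(hashlib.sha256(files[path]).digest())
    return h.hexdigest()


# Hypothetical build context before and after something like `git fetch`.
before = {"app.py": b"print('hi')\n", ".git/FETCH_HEAD": b"abc\n"}
after = dict(before)
after[".git/FETCH_HEAD"] = b"def\n"

# The "cache" is keyed only by the digest of the whole layer.
cache = {layer_digest(before): "cached-layer"}

# The lookup misses, and nothing in the digest says which file changed.
hit = cache.get(layer_digest(after))
print("cache hit" if hit else "cache miss")
```

The digest of `after` is simply a different key, so the only information available at lookup time is that no matching layer exists.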

The only way I can see to get individual file detail back would be to recompute the destination file checksums, which would probably negate most of the caching speedup. If you did want to do this anyway, the other problem is deciding which layer to check against: you would have to look up a previous image build tree (maybe by tag?) to find the contents of the previous comparable layer.
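As a workaround outside Docker, the per-file checksum idea can be approximated on the build context itself. The following is a minimal sketch, assuming you are willing to snapshot the directory before and after the operation you suspect (for example `git fetch`). The helper names `snapshot` and `diff_snapshots` are hypothetical. It compares file contents only: it does not honor `.dockerignore`, and it ignores any file metadata Docker may take into account.

```python
import hashlib
import os


def snapshot(root: str) -> dict[str, str]:
    """Map each regular file under root (relative path) to the SHA256 of its contents."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            h = hashlib.sha256()
            with open(full, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            manifest[os.path.relpath(full, root)] = h.hexdigest()
    return manifest


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report which paths were added, removed, or changed between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }
```

Calling `snapshot(".")` before and after the suspect command and passing both results to `diff_snapshots` lists the files whose contents differ, which are the likely candidates for the cache miss.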

answered Oct 23 '22 at 13:10 by Matt