
Docker build not using cache when copying Gemfile while using --cache-from

On my local machine, I have built the latest image, and running another docker build uses cache everywhere it should.

Then I upload the image to the registry as latest. On my CI server, I pull the latest image of my app and use it as the build cache for the new version:

    docker pull $CONTAINER_IMAGE:latest
    docker build --cache-from $CONTAINER_IMAGE:latest \
                 --tag $CONTAINER_IMAGE:$CI_COMMIT_SHORT_SHA \
                 .

The build output shows that the COPY of the Gemfile does not use the cache from the latest image, even though I haven't changed that file:

    Step 15/22 : RUN gem install bundler -v 1.17.3 &&     ln -s /usr/local/lib/ruby/gems/2.2.0/gems/bundler-1.16.0 /usr/local/lib/ruby/gems/2.2.0/gems/bundler-1.16.1
     ---> Using cache
     ---> 47a9ad7747c6
    Step 16/22 : ENV BUNDLE_GEMFILE=$APP_HOME/Gemfile     BUNDLE_JOBS=8
     ---> Using cache
     ---> 1124ad337b98
    Step 17/22 : WORKDIR $APP_HOME
     ---> Using cache
     ---> 9cd742111641
    Step 18/22 : COPY Gemfile $APP_HOME/
     ---> f7ff0ee82ba2
    Step 19/22 : COPY Gemfile.lock $APP_HOME/
     ---> c963b4c4617f
    Step 20/22 : RUN bundle install
     ---> Running in 3d2cdf999972

Side note: it works perfectly on my local machine.

The "Leverage build cache" section of the Docker documentation doesn't seem to explain this behaviour: neither the Dockerfile nor the Gemfile has changed, so the cache should be used.

What could prevent Docker from using the cache for the Gemfile?

Update

I tried to copy the files setting the right permissions using COPY --chown=user:group source dest but it still doesn't use the cache.

Opened Docker forum topic: https://forums.docker.com/t/docker-build-not-using-cache-when-copying-gemfile-while-using-cache-from/69186

ZedTuX asked Feb 07 '19 13:02





1 Answer

I've been scratching my head over issues with docker build and --cache-from for the last few days. The lack of documentation on how --cache-from is supposed to behave is a bit frustrating, and there is some misinformation in the wild.

I think I've finally managed to fix the issues I had on my side, after a few insights which I'm going to share here in the hopes it will be useful to someone else.

When providing multiple --cache-from, the order matters!

The order is very important: at the first match, Docker stops looking for other matches and uses that image for all the remaining commands.

This is explained by the fellow who implemented the feature in the GitHub PR:

When using multiple --cache-from they are checked for a cache hit in the order that user specified. If one of the images produces a cache hit for a command only that image is used for the rest of the build.

There is also a lengthier explanation in the initial ticket proposal:

Specifying multiple --cache-from images is bit problematic. If both images match there is no way(without doing multiple passes) to figure out what image to use. So we pick the first one(let user control the priority) but that may not be the longest chain we could have matched in the end. If we allow matching against one image for some commands and later switch to a different image that had a longer chain we risk in leaking some information between images as we only validate history and layers for cache. Currently I left it so that if we get a match we only use this target image for rest of the commands.
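To make the ordering rule concrete, here is a minimal sketch that assembles the --cache-from flags in priority order. The helper name cache_from_flags, the registry path and the my-branch tag are assumptions made up for illustration; they do not come from the question or from Docker itself.

```shell
#!/bin/sh
# Hypothetical helper: print one "--cache-from IMAGE" pair per argument,
# preserving the order given. Per the quotes above, the first image that
# produces a cache hit is the only one used for the rest of the build.
cache_from_flags() {
  flags=""
  for image in "$@"; do
    flags="${flags:+$flags }--cache-from $image"
  done
  printf '%s\n' "$flags"
}

# Placeholder image name (an assumption, not from the post).
IMAGE="registry.example.com/app"

# Preferred cache source first, fallback last.
cache_from_flags "$IMAGE:my-branch" "$IMAGE:latest"

# Intended use (needs a Docker daemon, so it is left commented out):
#   docker build $(cache_from_flags "$IMAGE:my-branch" "$IMAGE:latest") \
#     --tag "$IMAGE:new" .
```

Swapping the two arguments changes which image is consulted first, so it is worth deciding deliberately which tag is most likely to share layers with the build at hand.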

Using --cache-from is exclusive: the local Docker cache won't be used

This means that --cache-from doesn't add caching sources on top of the local cache: the image tags you provide will be the only caching sources for the Docker build.

Even if you just built the same image locally, the next time you run docker build for it, in order to benefit from the cache, you need to either:

  1. provide the correct tag with --cache-from (and with the correct precedence); or

  2. not use --cache-from at all (so that it will use the local build cache)
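The two options can be sketched as a pair of shell functions. This is only an illustration under assumptions: the function names are hypothetical, and the image name and tag are passed in as arguments rather than taken from any particular CI system.

```shell
#!/bin/sh
# Option 1: name the cache source explicitly.
# $1 = image name, $2 = tag for the new build (both supplied by the caller).
build_with_registry_cache() {
  image="$1"
  tag="$2"
  # The cache source has to be present locally before the build;
  # tolerate a failed pull, e.g. on the very first CI run.
  docker pull "$image:latest" || true
  docker build --cache-from "$image:latest" --tag "$image:$tag" .
}

# Option 2: no --cache-from at all, so the local build cache is used.
build_with_local_cache() {
  image="$1"
  tag="$2"
  docker build --tag "$image:$tag" .
}
```

On a CI runner that starts from an empty Docker state, the first function is probably the one that applies; on a developer machine that already holds the layers, the second is simpler.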

If the parent image changes, the cache will be invalidated

For example, if you have an image based on docker:stable, and docker:stable gets updated, the cached builds of your image will not be valid anymore as the layers of the base image were changed.

This is why, if you're configuring a CI build, it can be useful to docker pull the base image as well and include it in the --cache-from, as mentioned in this comment in yet another GitHub discussion.
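A rough sketch of that CI step follows. The function name is hypothetical, the base image must be whatever the Dockerfile's FROM line names (docker:stable is just the example used above), and the relative order of the two --cache-from flags is my assumption rather than something the linked discussion prescribes.

```shell
#!/bin/sh
# Hypothetical CI step: pull both the base image and the previous build,
# then offer both as cache sources.
# $1 = base image (e.g. docker:stable), $2 = app image name, $3 = new tag.
build_with_base_in_cache() {
  base_image="$1"
  image="$2"
  tag="$3"
  # Pull the base image so the build starts from the current base layers.
  docker pull "$base_image"
  # The previous build may not exist yet on the first run.
  docker pull "$image:latest" || true
  docker build \
    --cache-from "$image:latest" \
    --cache-from "$base_image" \
    --tag "$image:$tag" \
    .
}
```

A call would look like build_with_base_in_cache docker:stable "$CONTAINER_IMAGE" "$CI_COMMIT_SHORT_SHA", reusing the variables from the question.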

Elias Dorneles answered Oct 03 '22 22:10
