
Why doesn't Docker Hub cache Automated Build Repositories as the images are being built?

Note: It appears the premise of my question is no longer valid since the new Docker Hub appears to support caching. I haven't personally tested this. See the new answer below.

Docker Hub's Automated Build Repositories don't seem to cache images: during a build, all intermediate containers are removed. Is this how it was intended to work, or am I doing something wrong? It would be really nice not to have to rebuild everything for every small change. I thought caching was supposed to be one of Docker's biggest advantages, so it seems odd that their own builder doesn't use it. So why doesn't it cache images?

UPDATE: I've started using Codeship to build my app and then run remote commands on my DigitalOcean server to copy the built files and run the docker build command. I'm still not sure why Docker Hub doesn't cache.

asked Aug 03 '14 06:08 by Scotty Waggoner



2 Answers

Disclaimer: I am a lead software engineer at Quay.io, a private Docker container registry, so this is an educated guess based on the same problem we faced in our own build system implementation.

Given my experience with Dockerfile build systems, I suspect that Docker Hub does not support caching because of the way caching is implemented in the Docker Engine. Caching for Docker builds works by comparing the commands to be run against the existing layers already present on the build machine.

For example, if the Dockerfile has the form:

FROM somebaseimage
RUN somecommand
ADD somefile somefile

Then the Docker build code will:

  1. Check to see if an image matching somebaseimage exists
  2. Check if there is a local image with the command RUN somecommand whose parent is the previous image
  3. Check if there is a local image with the command ADD somefile somefile + a hashing of the contents of somefile (to make sure it is invalidated when somefile changes), whose parent is the previous image

For each step that matches, that command is skipped in the Dockerfile build process and the cached image is used instead. However, the key issue with this process is that it requires the cached images to be present on the build machine in order to find and verify the matches. Keeping all of everyone's images on every build node would be highly inefficient, which makes this a harder problem to solve.
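To make the lookup described above concrete, here is a minimal Python sketch. It is not Docker's actual code: the names `step_key`, `build`, and `local_images` are hypothetical, image IDs are faked with a hash, and the "local image store" is just a dictionary keyed by (parent image, command).

```python
import hashlib

def step_key(instruction, files):
    """Cache key for one instruction; ADD also hashes the source file's contents."""
    op, _, args = instruction.partition(" ")
    if op == "ADD":
        src = args.split()[0]
        digest = hashlib.sha256(files[src]).hexdigest()
        return f"{instruction} sha256:{digest}"
    return instruction

def build(instructions, files, local_images):
    """Return (final image id, per-step log of 'cached' or 'built').

    local_images maps (parent_id, step_key) -> image_id and stands in for
    the images that happen to be present on the build machine.
    """
    parent = None
    log = []
    for instruction in instructions:
        key = (parent, step_key(instruction, files))
        if key in local_images:
            # A local image with this command and this parent exists: reuse it.
            parent = local_images[key]
            log.append("cached")
        else:
            # No match: "run" the step and record the new image (ID is faked here).
            parent = hashlib.sha256(repr(key).encode()).hexdigest()[:12]
            local_images[key] = parent
            log.append("built")
    return parent, log

instructions = ["FROM somebaseimage", "RUN somecommand", "ADD somefile somefile"]
store = {}
_, first = build(instructions, {"somefile": b"v1"}, store)   # nothing cached yet
_, second = build(instructions, {"somefile": b"v1"}, store)  # every step matches
_, third = build(instructions, {"somefile": b"v2"}, store)   # only ADD is rebuilt
_, fresh = build(instructions, {"somefile": b"v1"}, {})      # a node with no images
```

The last call is the situation a shared build node is in: with an empty local store, nothing can be matched, no matter how many times the same Dockerfile was built elsewhere.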

At Quay.io, we solved the caching problem by creating a variation of the Docker caching code that could precompute these commands/hashes and then ask our registry for the cached layers, downloading them to the machine only after we had found the most efficient caching set. This required significant data model changes in our registry code.
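As a rough illustration of that idea (not Quay.io's actual implementation), the sketch below assumes the cache keys have already been precomputed and that the registry can answer "is there a layer with this parent and this key?". The function `longest_cached_chain`, the `registry_index` dictionary, and the layer IDs are all hypothetical stand-ins.

```python
def longest_cached_chain(step_keys, registry_index, base_id):
    """Walk the precomputed keys and collect the layers the registry already has.

    registry_index maps (parent_layer_id, step_key) -> layer_id and stands in
    for a query against the registry. Nothing is downloaded during the walk;
    the caller would fetch only the returned layers afterwards.
    """
    chain = []
    parent = base_id
    for key in step_keys:
        layer_id = registry_index.get((parent, key))
        if layer_id is None:
            break  # first miss: every later step has to be built
        chain.append(layer_id)
        parent = layer_id
    return chain

registry_index = {
    ("base", "RUN somecommand"): "layer-a",
    ("layer-a", "ADD somefile somefile sha256:1111"): "layer-b",
}
keys_unchanged = ["RUN somecommand", "ADD somefile somefile sha256:1111"]
keys_changed = ["RUN somecommand", "ADD somefile somefile sha256:2222"]

to_download = longest_cached_chain(keys_unchanged, registry_index, "base")
to_download_changed = longest_cached_chain(keys_changed, registry_index, "base")
```

With the unchanged file, both layers can be fetched and no step needs to run; with the changed file, only the layer for the RUN step is fetched and the ADD step is rebuilt.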

If you'd like more information, we gave a technical overview into how we do so in this talk: https://youtu.be/anfmeB_JzB0?list=PLlh6TqkU8kg8Ld0Zu1aRWATiqBkxseZ9g

answered Sep 20 '22 23:09 by Joey Schorr


The new Docker Hub introduced a new Automated Build system that supports build caching.

https://blog.docker.com/2018/12/the-new-docker-hub/

answered Sep 19 '22 23:09 by Attila Szeremi