Docker hub image cache doesn't seem to be working

We have a continuous integration pipeline on circleci that does the following:

  1. Loads repo/image:mytag1 from the cache directory to be able to use cached layers
  2. Builds a new version: docker build -t repo/image:mytag2
  3. Saves the new version to the cache directory with docker save
  4. Runs tests
  5. Pushes to docker hub: docker push repo/image:mytag2
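
For reference, the five steps above might look roughly like the following script. This is only a sketch under assumptions: the cache file location, the `run-tests.sh` entry point, and the `DRY_RUN` switch are hypothetical and not taken from the actual CircleCI configuration. With `DRY_RUN=1` (the default here) the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch of the pipeline described above; paths and the test command are assumptions.
IMAGE="repo/image"
NEW_TAG="mytag2"
CACHE_FILE="${CACHE_DIR:-$HOME/docker-cache}/image.tar"  # hypothetical cache location

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

pipeline() {
  # 1. Load the previously saved image so its layers are available locally.
  run docker load -i "$CACHE_FILE"
  # 2. Build the new version.
  run docker build -t "$IMAGE:$NEW_TAG" .
  # 3. Save the new version back to the cache directory.
  run docker save -o "$CACHE_FILE" "$IMAGE:$NEW_TAG"
  # 4. Run the tests (hypothetical entry point inside the image).
  run docker run --rm "$IMAGE:$NEW_TAG" ./run-tests.sh
  # 5. Push to Docker Hub; this is the slow step in question.
  run docker push "$IMAGE:$NEW_TAG"
}

pipeline
```

Setting `DRY_RUN=0` would run the commands for real, which requires a working Docker daemon and an existing cache file.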

The problem is with step 5. The push step takes 5 minutes every time. If I understand it correctly, docker hub is meant to cache layers so we don't have to re-push things like the base image and dependencies if they are not updated.

I ran the build twice in a row, and I see a lot of crossover in the hash of the layers being pushed. Yet rather than "Image already exists" I see "Image successfully pushed".

Here's the output of build 1's docker push, and here's build 2

If you diff those two files you'll see that only 2 layers differ in each build:

< ca44fed88be6: Buffering to Disk
< ca44fed88be6: Image successfully pushed
< 5dbd19bfac8a: Buffering to Disk
< 5dbd19bfac8a: Image successfully pushed
---
> 9136b10cfb72: Buffering to Disk
> 9136b10cfb72: Image successfully pushed
> 0388311b6857: Buffering to Disk
> 0388311b6857: Image successfully pushed
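
The same comparison can be made without reading the diff by eye. Below is a rough sketch that extracts the layer IDs reported as pushed in each log and counts the overlap; the log file names are hypothetical, and it assumes each log line has the form `<layer id>: Image successfully pushed` as in the output above.

```shell
#!/bin/sh
# List the layer IDs that a `docker push` log reports as pushed.
pushed_layers() {
  grep 'Image successfully pushed' "$1" | cut -d: -f1 | sort -u
}

# Compare two push logs: how many layers were pushed in both builds,
# and which layers appear in only one of them.
compare_pushes() {
  a=$(mktemp)
  b=$(mktemp)
  pushed_layers "$1" > "$a"
  pushed_layers "$2" > "$b"
  echo "re-pushed in both builds: $(comm -12 "$a" "$b" | grep -c .)"
  echo "only in first build: $(comm -23 "$a" "$b" | paste -sd ' ' -)"
  echo "only in second build: $(comm -13 "$a" "$b" | paste -sd ' ' -)"
  rm -f "$a" "$b"
}

# Example (hypothetical file names):
#   compare_pushes build1-push.log build2-push.log
```

A large "re-pushed in both builds" count is the symptom described here: identical layer IDs being uploaded again instead of being reported as already existing.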

So why do all the images have to be re-pushed every time?

jtmarmon asked Jan 19 '16 21:01


1 Answer

Using a different tag creates a different image which, when pushed, cannot rely on the cache.

For example the two commands:

$ docker commit -m "thing" -a "me" db65bf421f96 me/thing:v1
$ docker commit -m "thing" -a "me" db65bf421f96 me/thing:v2

yield utterly distinct images even though they were created from the same container (db65bf421f96). When pushed, Docker Hub must treat them as completely separate images, as can be seen with:

$ docker images
REPOSITORY     TAG      IMAGE ID
me/thing       v2       f14aa8ac6bae
me/thing       v1       c7d72ccc1d71

The image IDs are unique, and thus the images are unique, even if they differ only in their tags.

You could say "docker should recognize them as being bit for bit identical" and thus treat them as cachable. But it doesn't (yet).

The only surprise for me in your example is that you got any duplicate image IDs at all.

Authoritative (if less explanatory) documentation can be found in the Docker docs under "Build your own images".

msw answered Oct 21 '22 16:10