
Circle CI Docker service does not cache COPY

I'm running Docker on CircleCI and I'm having trouble getting COPY commands to cache.

The CircleCI docs mention known caching issues and recommend using this Perl script to set the timestamps on the files copied over, so that the cache is preserved.

The Docker best practice docs state:

In the case of the ADD and COPY instructions, the contents of the file(s) being put into the image are examined. Specifically, a checksum is done of the file(s) and then that checksum is used during the cache lookup.

As per the CircleCI recommendations, I am saving the cache to disk and then loading it again on the next test run. This seems to be working, as the commands prior to COPY are cached correctly.

To debug, I'm outputting the md5 checksum of the file I am trying to copy, first locally and then from inside the Docker container, and the two match. So, in theory, the cache should load. I am not sure whether Docker uses md5 for its checksum, though.

This is my current circle.yml:

machine:
  services:
    - docker

dependencies:
  cache_directories:
    - "~/docker"
  pre:
    - mkdir -p ~/docker
  override:
    - docker info
    - if [[ -e ~/docker/image.tar ]]; then docker load -i ~/docker/image.tar; fi
    - docker images
    - docker build -t circles .

checkout:
  post:
    - ls -l
    - ./timestamp-set-to-git.pl
    - ls -l

test:
  override:
    - md5sum .bowerrc
    - docker run circles md5sum .bowerrc
    - docker save circles > ~/docker/image.tar

This is what the build outputs for the checksum steps:

$ md5sum .bowerrc
8d1a712721d735bd41bf738cae3226a2 .bowerrc

$ docker run circles md5sum .bowerrc
8d1a712721d735bd41bf738cae3226a2 .bowerrc

But the docker build reports this:

Step 6 : RUN sudo npm install -g phantomjs gulp
 ---> Using cache
 ---> a7bbf2b17977
Step 7 : COPY .bowerrc /var/work/.bowerrc
 ---> 7ad82336de64

Does anyone know why COPY is not caching?

Rimian asked Jan 13 '15 23:01


2 Answers

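Docker uses a TarSum to decide whether to use the cache, and this checksum covers file metadata as well as file contents, most importantly the modification time. That is why matching md5 sums of the contents are not enough for a cache hit: a fresh git clone gives every file a new modification time, which forces Docker to rebuild from the first COPY or ADD step onward.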

To work around this, I use a Makefile with the following target...

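# Recipe lines below must be indented with a tab, not spaces.
.PHONY: build hack-touch

build: hack-touch
	docker build -t MYTAG .

# Give cached inputs a fixed mtime so a fresh clone does not change the checksum.
hack-touch:
	@echo "Reset timestamps on git working directory files..."
	find conf -exec touch -t 200001010000.00 {} +
	touch -t 200001010000.00 Gruntfile.js bower.json package.json .bowerrc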

(In my case, everything I want cached, such as the requirements.txt files, lives in conf, apart from the Gruntfile-related files on the second touch line. I don't need any of my actual source code to be cached.)
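
The effect can be reproduced outside Docker. The sketch below is only an illustration: it hashes a tar stream of the file as a stand-in for a metadata-aware checksum, since a tar header records the modification time. It is not Docker's actual TarSum implementation. The helper names content_sum and archive_sum, the temporary directory and the file contents are made up for the demo, and it assumes md5sum, tar and mktemp are available.

```shell
#!/bin/sh
set -eu

# Content-only checksum: the same thing the md5sum calls in the question compare.
content_sum() {
  md5sum < "$1" | cut -d' ' -f1
}

# Checksum over a tar stream of the file. The tar header stores the
# modification time, so this changes whenever the mtime changes.
# Illustration only, NOT Docker's real TarSum algorithm.
archive_sum() {
  tar -cf - -C "$(dirname "$1")" "$(basename "$1")" | md5sum | cut -d' ' -f1
}

workdir=$(mktemp -d)
# Placeholder contents; any bytes would do.
printf '{"directory": "bower_components"}\n' > "$workdir/.bowerrc"

# First "checkout": pin the file to a known timestamp.
touch -t 200001010000.00 "$workdir/.bowerrc"
content_before=$(content_sum "$workdir/.bowerrc")
archive_before=$(archive_sum "$workdir/.bowerrc")

# Simulate a fresh clone: identical bytes, different modification time.
touch -t 201501132300.00 "$workdir/.bowerrc"
content_after=$(content_sum "$workdir/.bowerrc")
archive_after=$(archive_sum "$workdir/.bowerrc")

if [ "$content_before" = "$content_after" ]; then
  echo "content checksum: unchanged"
else
  echo "content checksum: changed"
fi

if [ "$archive_before" = "$archive_after" ]; then
  echo "archive checksum: unchanged"
else
  echo "archive checksum: changed"
fi

rm -rf "$workdir"
```

On a typical Linux machine this should report an unchanged content checksum and a changed archive checksum, which mirrors the question: md5sum matches, yet the COPY step misses the cache.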

Paul Becotte answered Sep 28 '22 20:09


I ran into the same problem using drone.io (another CI tool).

The reason is that 'git clone' (over)writes all local files, which then carry the time of the clone as their timestamp. Since the hash Docker computes for the files in a COPY or ADD command takes this metadata into account, the hash now differs from the one stored in the cache. Docker therefore invalidates the cache for that step and reruns it, along with every step after it.
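
One way to neutralise the clone time is to give every tracked file the same fixed timestamp right after checkout. This is a minimal sketch, assuming a POSIX shell plus an xargs that supports -0 and -r (GNU findutils does). reset_mtimes is a hypothetical helper name, and the year-2000 timestamp is arbitrary.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: read NUL-separated paths on stdin and set the
# access and modification time of each one to the given fixed stamp.
# The stamp uses the format `touch -t` expects: [[CC]YY]MMDDhhmm[.SS]
reset_mtimes() {
  # -0 keeps paths with spaces intact; -r skips touch on empty input.
  xargs -0 -r touch -t "$1" --
}

# Example call from the repository root (assumes git is installed):
#   git ls-files -z | reset_mtimes 200001010000.00
```

A file whose contents really change still alters the checksum, so genuine edits continue to invalidate the cache. Only the noise from the clone time goes away.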

dhr_p answered Sep 28 '22 21:09