
Optimizing Azure DevOps docker pipeline using cached layers

I'm trying to optimize build time in my Azure DevOps pipeline, but the npm install step in my Dockerfile just will not cache. Why?

This is my Dockerfile. I've separated copying the package*.json files and running npm install into its own layer before copying the rest of my sources, as this is best practice and should make the npm install layer cacheable between builds.

FROM node:12-alpine3.12 AS builder
WORKDIR /app
ARG VERSION

COPY package.json ./
COPY package-lock.json ./
RUN npm install

COPY . .
RUN npm run build

  ...
FROM node:12-alpine3.12
COPY --from=builder /dist .
  ...

This is my build pipeline. Since Azure builds on a clean VM each time, I pull down the existing images first in order to take advantage of the layers cached by previous builds (ref: How to Enable Docker layer caching in Azure DevOps).

- script: |
    registry=myregistry.azurecr.io
    image=${registry}/myApp:$(Build.SourceBranchName)

    # Pull in previously built builder image because cache
    docker pull ${image}-builder
    # Build the builder target
    docker build \
      --target builder \
      --cache-from ${image}-builder \
      -t ${image}-builder \
      --build-arg VERSION=$(Build.BuildNumber) \
      -f apps/myApp/Dockerfile .

    # Pull in previously built image because cache
    docker pull ${image}
    docker build \
      --cache-from ${image}-builder \
      --cache-from ${image} \
      -t ${image} \
      --build-arg VERSION=$(Build.BuildNumber) \
      -f apps/myApp/Dockerfile .

    docker push ${image}
    docker push ${image}-builder
  displayName: Build and push an image

As you can see, each stage in my Dockerfile gets its own build step in the pipeline: one builds the "builder" stage, and one builds the resulting image. The image from each step is pushed to my container registry. On rebuilds, or on builds where package.json has not changed, I would expect the npm install layer to output ---> Using cache, but it never does when building the "builder" stage.

Step 1/8 : FROM node:12-alpine3.12 AS builder
12-alpine3.12: Pulling from library/node
188c0c94c7c5: Already exists
c4e63f2c1114: Already exists
74bf6ceff101: Already exists
1f6472fc624b: Already exists
Digest: sha256:f2e453020045d7d93790777bc3ce2c992f097ce9a6d577d73490093df93b0702
Status: Downloaded newer image for node:12-alpine3.12
 ---> ccd680d0b809
Step 2/8 : WORKDIR /app
 ---> Using cache
 ---> 9f88e2fda996
Step 3/8 : ARG VERSION
 ---> Using cache
 ---> 707e936abbc5
Step 4/8 : COPY package.json ./
 ---> Using cache
 ---> 034785fd08a7
Step 5/8 : COPY package-lock.json ./
 ---> Using cache
 ---> ab778dbabb01
Step 6/8 : RUN npm install
 ---> Running in df1dc4b5bf91
    ...
Removing intermediate container df1dc4b5bf91
 ---> 4ee43e4f6095
Step 7/8 : COPY . .
 ---> 9ea6540727f2
Step 8/8 : RUN npm run build
 ---> Running in bd65f90191a5

Note the "Removing intermediate container df1dc4b5bf91" line above; could it be related to the problem? I did try docker build --rm=false, though, and the rebuild still did not use the cached layer. The layer is, however, taken from the cache when the last stage of my pipeline is built:

Step 1/16 : FROM node:12-alpine3.12 AS builder
 ---> ccd680d0b809
Step 2/16 : WORKDIR /app
 ---> Using cache
 ---> 9f88e2fda996
Step 3/16 : ARG VERSION
 ---> Using cache
 ---> 707e936abbc5
Step 4/16 : COPY package.json ./
 ---> Using cache
 ---> 034785fd08a7
Step 5/16 : COPY package-lock.json ./
 ---> Using cache
 ---> ab778dbabb01
Step 6/16 : RUN npm install
 ---> Using cache
 ---> 4ee43e4f6095

What am I missing?

Øystein Amundsen asked Nov 01 '20 01:11




1 Answer

Solved it!

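The problem here is the ARG instruction in the Dockerfile. The value passed for VERSION is the build number, so it changes on every build, and every RUN instruction that follows an ARG in the same stage implicitly uses the variable as an environment variable. The ARG step itself still reports "Using cache", but the first RUN after it (here npm install) gets a cache miss, and that invalidates every layer after it as well.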
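The behaviour can be reproduced in isolation. The script below is a minimal sketch, not part of the original pipeline: the directory name arg-cache-demo and the RUN_DOCKER_DEMO switch are made up for this example, and the echo step merely stands in for npm install.

```shell
#!/bin/sh
# Minimal reproduction sketch. The directory name and the RUN_DOCKER_DEMO
# switch are hypothetical; they are not part of the original pipeline.
mkdir -p arg-cache-demo
cat > arg-cache-demo/Dockerfile <<'EOF'
FROM node:12-alpine3.12
# VERSION is declared inside the stage, so every RUN below sees it
ARG VERSION
RUN echo "stand-in for npm install"
EOF

# Build twice with different values. The second build is expected to re-run
# the RUN step instead of reporting "Using cache".
if [ "${RUN_DOCKER_DEMO:-0}" = "1" ]; then
  docker build --build-arg VERSION=1 arg-cache-demo
  docker build --build-arg VERSION=2 arg-cache-demo
fi
```

Running it with RUN_DOCKER_DEMO=1 on a machine with Docker should show the RUN step executing again on the second build. Passing the same VERSION twice should hit the cache, which matches what the last stage of the pipeline showed.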

From the Docker docs: https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact

ARG is the only instruction that may precede FROM in the Dockerfile

ARG VERSION

FROM node:12-alpine3.12 AS builder
WORKDIR /app

COPY package.json ./
COPY package-lock.json ./
RUN npm install

COPY . .
RUN npm run build

  ...
FROM node:12-alpine3.12
COPY --from=builder /dist .
# An ARG declared before FROM must be redeclared inside the stage that uses it
ARG VERSION
RUN if [ "x$VERSION" = "x" ] ; then echo "VERSION not set" ; else echo "$VERSION" > ./assets/version.txt ; fi
  ...

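By placing the ARG before the first FROM, it is declared outside of any build stage, so it no longer affects the RUN instructions in the builder stage and the npm install layer stays cached. A stage that needs the value has to redeclare it with its own ARG VERSION line, since an ARG declared before FROM is otherwise only visible to the FROM instructions.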
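If the builder stage itself needs VERSION (for example during npm run build), a possible alternative is to keep the ARG in that stage but declare it after npm install. The script below writes such a variant to a hypothetical file name, Dockerfile.late-arg. It is a sketch based on the Dockerfile from the question, not something taken from the original setup.

```shell
#!/bin/sh
# Sketch only: writes an alternative builder stage to a hypothetical file.
cat > Dockerfile.late-arg <<'EOF'
FROM node:12-alpine3.12 AS builder
WORKDIR /app

COPY package.json ./
COPY package-lock.json ./
RUN npm install

# Declared after npm install, so a new VERSION only affects the RUN steps below
ARG VERSION
COPY . .
RUN npm run build
EOF

# From the directory that holds package.json, build twice with different
# values; the npm install step should be reused on the second build
# (requires Docker, so left commented out here):
# docker build --target builder --build-arg VERSION=1 -f Dockerfile.late-arg .
# docker build --target builder --build-arg VERSION=2 -f Dockerfile.late-arg .
```

With this ordering, only the steps from ARG VERSION onward are rebuilt when the build number changes. Those steps are rebuilt on most commits anyway because of COPY . . and the source changes it picks up.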

Øystein Amundsen answered Oct 05 '22 23:10