How to cache multi-stage docker build in google cloud builder

I have a Docker multi-stage build, for example:

FROM golang:1.7.3
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html  
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest  
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]  

Then I have cloudbuild.yml:

steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['pull', 'gcr.io/$PROJECT_ID/app:$BRANCH_NAME']
- name: 'gcr.io/cloud-builders/docker'
  args: ['pull', 'gcr.io/$PROJECT_ID/app:latest']
- name: 'gcr.io/cloud-builders/docker'
  args: [
            'build',
            '--cache-from', 'gcr.io/$PROJECT_ID/app:latest',
            '--cache-from', 'gcr.io/$PROJECT_ID/app:$BRANCH_NAME',
            '--build-arg', 'COMMIT_HASH=$COMMIT_SHA',
            '-t', 'gcr.io/$PROJECT_ID/app:$COMMIT_SHA',
            '-f', 'config/dockerfiles/app.dockerfile',
            '.'
        ]
- name: 'gcr.io/cloud-builders/docker'
  args: ["tag", "gcr.io/$PROJECT_ID/app:$COMMIT_SHA", "gcr.io/$PROJECT_ID/app:$BRANCH_NAME"]
- name: 'gcr.io/cloud-builders/docker'
  args: ["tag", "gcr.io/$PROJECT_ID/app:$COMMIT_SHA", "gcr.io/$PROJECT_ID/app:latest"]
images: [
  'gcr.io/$PROJECT_ID/app:$COMMIT_SHA',
  'gcr.io/$PROJECT_ID/app:$BRANCH_NAME',
  'gcr.io/$PROJECT_ID/app:latest'
]

Now I want to cache not only the resulting image but also the builder stage. For example, in Go I have a /vendor directory that I populate using dep, and I would like to cache those dependencies. What is the easiest way to achieve that on Google Cloud Platform? I think my question is mostly Docker-specific, but still.

asked May 09 '18 09:05 by nmiculinic


1 Answer

The builder image needs to be built and tagged separately. You need to push the image produced by the build stage and use it as a cache source in subsequent builds. To do that, it is more convenient to name your build stage:

FROM golang:1.7.3 as builder
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html  
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest  
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]

In your cloudbuild.yaml you need to decide which image to pull for the best cache usage, and "store" that decision somewhere. I will show how to do it by storing the tag in a file.

It's easier if you keep your logic in one build step:

steps:
- name: 'gcr.io/cloud-builders/docker'
  entrypoint: 'bash'
  args:
    - '-c'
    - |
      # Record which tag to use as cache: this branch if it exists, else master.
      mkdir -p tmp
      (docker pull gcr.io/$PROJECT_ID/app:$BRANCH_NAME && echo "$BRANCH_NAME" > tmp/base) ||
        echo "master" > tmp/base

      # Pull the cache sources; ignore failures so a first build still runs.
      docker pull "gcr.io/$PROJECT_ID/app-builder:$(cat tmp/base)" || true
      docker pull "gcr.io/$PROJECT_ID/app:$(cat tmp/base)" || true

      # Build and tag only the builder stage.
      docker build \
          --cache-from "gcr.io/$PROJECT_ID/app-builder:$(cat tmp/base)" \
          -t gcr.io/$PROJECT_ID/app-builder:$BRANCH_NAME \
          -t gcr.io/$PROJECT_ID/app-builder:$COMMIT_SHA \
          -t gcr.io/$PROJECT_ID/app-builder:latest \
          --build-arg COMMIT_HASH=$COMMIT_SHA \
          -f config/dockerfiles/app.dockerfile \
          --target builder \
          .

      # Build the final image, reusing the builder stage that was just built.
      docker build \
          --cache-from "gcr.io/$PROJECT_ID/app-builder:$COMMIT_SHA" \
          --cache-from "gcr.io/$PROJECT_ID/app:$(cat tmp/base)" \
          -t gcr.io/$PROJECT_ID/app:$BRANCH_NAME \
          -t gcr.io/$PROJECT_ID/app:$COMMIT_SHA \
          -t gcr.io/$PROJECT_ID/app:latest \
          --build-arg COMMIT_HASH=$COMMIT_SHA \
          -f config/dockerfiles/app.dockerfile \
          .
images: [
  'gcr.io/$PROJECT_ID/app-builder:$COMMIT_SHA',
  'gcr.io/$PROJECT_ID/app-builder:$BRANCH_NAME',
  'gcr.io/$PROJECT_ID/app-builder:latest',
  'gcr.io/$PROJECT_ID/app:$COMMIT_SHA',
  'gcr.io/$PROJECT_ID/app:$BRANCH_NAME',
  'gcr.io/$PROJECT_ID/app:latest'
]

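The script creates the tag file inside tmp/, so it's important that Docker ignores this directory or file (add it to .dockerignore). Otherwise the file becomes part of the build context and can invalidate the cache whenever its content changes.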
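As a rough sketch of that .dockerignore step, the snippet below adds a tmp entry only when it is missing. The function name ensure_ignored is made up for this illustration, and it assumes you run it once in the repository root and commit the result, since Cloud Build reads .dockerignore from your source:

```shell
# ensure_ignored PATTERN FILE
# Hypothetical helper: append PATTERN to FILE unless that exact line exists.
ensure_ignored() {
  pattern="$1"
  file="$2"
  touch "$file"
  # Nothing to do if the exact line is already present.
  if grep -qxF "$pattern" "$file"; then
    return 0
  fi
  # If the file does not end with a newline, add one first so the
  # new entry does not get glued onto the last existing line.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    echo >> "$file"
  fi
  echo "$pattern" >> "$file"
}

# "tmp" is the directory created by the build step above.
ensure_ignored tmp .dockerignore
```

Running it a second time leaves the file unchanged, so it is safe to keep in a setup script.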

Notice that I avoided passing --cache-from two tags of the same image (for example latest and $BRANCH_NAME, as in the question). In my experiments the build picked the older image as its cache source, which invalidated the cache. Also observe that the first docker build command has a --target argument. This tells Docker to build only up to the end of that stage.

I changed the fallback image tag to master because it guarantees that the base image is stable and has not diverged too far from your branch, which results in better cache reuse and therefore better performance. The latest tag is unnecessary in my example.
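If you prefer a shell variable over the tmp/base file, the same fallback decision can be sketched as a small function. This is only an illustration: pick_base_tag is a hypothetical name, and master as the default is carried over from my example, not something Docker or Cloud Build requires.

```shell
# pick_base_tag IMAGE BRANCH [FALLBACK]
# Hypothetical helper: print BRANCH if IMAGE:BRANCH can be pulled,
# otherwise print FALLBACK (default: master).
pick_base_tag() {
  image="$1"
  branch="$2"
  fallback="${3:-master}"
  if docker pull "$image:$branch" >/dev/null 2>&1; then
    echo "$branch"
  else
    echo "$fallback"
  fi
}

# Example use in a standalone script:
#   base="$(pick_base_tag "gcr.io/my-project/app" "my-branch")"
#   docker pull "gcr.io/my-project/app-builder:$base" || true
```

Inside cloudbuild.yaml, Cloud Build applies its own substitution to $NAME, so a shell variable such as base would normally need to be written as $$base. That is one reason I used a file in the step above.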


answered Oct 20 '22 12:10 by Philip Sampaio