I have a Docker multi-stage build, for example:
FROM golang:1.7.3
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]
Then I have cloudbuild.yml:
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['pull', 'gcr.io/$PROJECT_ID/app:$BRANCH_NAME']
- name: 'gcr.io/cloud-builders/docker'
  args: ['pull', 'gcr.io/$PROJECT_ID/app:latest']
- name: 'gcr.io/cloud-builders/docker'
  args: [
    'build',
    '--cache-from', 'gcr.io/$PROJECT_ID/app:latest',
    '--cache-from', 'gcr.io/$PROJECT_ID/app:$BRANCH_NAME',
    '--build-arg', 'COMMIT_HASH=$COMMIT_SHA',
    '-t', 'gcr.io/$PROJECT_ID/app:$COMMIT_SHA',
    '-f', 'config/dockerfiles/app.dockerfile',
    '.'
  ]
- name: 'gcr.io/cloud-builders/docker'
  args: ["tag", "gcr.io/$PROJECT_ID/app:$COMMIT_SHA", "gcr.io/$PROJECT_ID/app:$BRANCH_NAME"]
- name: 'gcr.io/cloud-builders/docker'
  args: ["tag", "gcr.io/$PROJECT_ID/app:$COMMIT_SHA", "gcr.io/$PROJECT_ID/app:latest"]
images: [
  'gcr.io/$PROJECT_ID/app:$COMMIT_SHA',
  'gcr.io/$PROJECT_ID/app:$BRANCH_NAME',
  'gcr.io/$PROJECT_ID/app:latest'
]
Now I want to cache not only the resulting image but also the builder stage. For example, in Go I have /vendor, which I construct using dep, and I would like to cache those dependencies. What is the easiest way to achieve that on Google Cloud Platform? I think my question is mostly Docker-specific, but still.
Use multi-stage builds. With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image.
When using multi-stage builds, you are not limited to copying from stages you created earlier in your Dockerfile. You can use the COPY --from instruction to copy from a separate image, either using the local image name, a tag available locally or on a Docker registry, or a tag ID.
Another option is BuildKit's inline cache, which embeds cache metadata in the image you push. This involves passing the argument --build-arg BUILDKIT_INLINE_CACHE=1 to your docker build command. You will also need to ensure BuildKit is being used by setting the environment variable DOCKER_BUILDKIT=1 (on Linux; I think BuildKit might be the default backend on Windows when using recent versions of Docker Desktop).
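A minimal sketch of such an inline-cache build, wrapped in a shell function. The function name and the image reference passed to it are placeholders of mine, not part of any original setup. As far as I know, the inline cache only covers the layers of the image being exported, so for a multi-stage Dockerfile the builder stage may still need its own tagged image.

```shell
# Hypothetical helper: build with BuildKit and embed cache metadata in the image,
# so that a later build can reuse it via --cache-from.
# Usage: build_with_inline_cache <image-ref> [context-dir]
build_with_inline_cache() {
  local image="$1"
  local context="${2:-.}"
  DOCKER_BUILDKIT=1 docker build \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    --cache-from "$image" \
    -t "$image" \
    "$context"
}
```

The image then has to be pushed (docker push) for the next build to find the cache.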
The builder image needs to be built and tagged separately. You need to push the image produced by the build stage and use it as cache in subsequent builds. For that, it is more convenient to name your build stage.
FROM golang:1.7.3 as builder
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]
In your cloudbuild.yaml, you need to know which image to pull for the best cache usage, and "store" that decision somewhere. I will show you how it can be done by storing it in a file.
It's easier if you keep your logic in one build step:
steps:
- name: 'gcr.io/cloud-builders/docker'
  entrypoint: 'bash'
  args:
  - '-c'
  - |
    mkdir -p tmp
    # Use this branch's image as cache if it exists; otherwise fall back to master.
    (docker pull "gcr.io/$PROJECT_ID/app:$BRANCH_NAME" && echo "$BRANCH_NAME" > tmp/base) ||
      echo "master" > tmp/base
    docker pull "gcr.io/$PROJECT_ID/app-builder:$(cat tmp/base)" || true
    docker pull "gcr.io/$PROJECT_ID/app:$(cat tmp/base)" || true
    # Build and tag only the builder stage.
    docker build \
      --cache-from "gcr.io/$PROJECT_ID/app-builder:$(cat tmp/base)" \
      -t "gcr.io/$PROJECT_ID/app-builder:$BRANCH_NAME" \
      -t "gcr.io/$PROJECT_ID/app-builder:$COMMIT_SHA" \
      -t "gcr.io/$PROJECT_ID/app-builder:latest" \
      --build-arg "COMMIT_HASH=$COMMIT_SHA" \
      -f config/dockerfiles/app.dockerfile \
      --target builder \
      .
    # Build the final image, reusing the builder image that was just built.
    docker build \
      --cache-from "gcr.io/$PROJECT_ID/app-builder:$COMMIT_SHA" \
      --cache-from "gcr.io/$PROJECT_ID/app:$(cat tmp/base)" \
      -t "gcr.io/$PROJECT_ID/app:$BRANCH_NAME" \
      -t "gcr.io/$PROJECT_ID/app:$COMMIT_SHA" \
      -t "gcr.io/$PROJECT_ID/app:latest" \
      --build-arg "COMMIT_HASH=$COMMIT_SHA" \
      -f config/dockerfiles/app.dockerfile \
      .
images: [
  'gcr.io/$PROJECT_ID/app-builder:$COMMIT_SHA',
  'gcr.io/$PROJECT_ID/app-builder:$BRANCH_NAME',
  'gcr.io/$PROJECT_ID/app-builder:latest',
  'gcr.io/$PROJECT_ID/app:$COMMIT_SHA',
  'gcr.io/$PROJECT_ID/app:$BRANCH_NAME',
  'gcr.io/$PROJECT_ID/app:latest'
]
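The script writes the chosen tag to a file inside tmp/ (tmp/base), so it's important that Docker ignores this directory or file (add it to .dockerignore). Otherwise it becomes part of the build context and can invalidate cached layers that copy the whole context.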
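One way to add that ignore entry without duplicating it on repeated runs is sketched below. The helper name is hypothetical, and it assumes .dockerignore lives at the root of the build context.

```shell
# Hypothetical helper: append an entry to a .dockerignore file only if it is missing.
# Usage: ensure_dockerignore_entry <entry> [path-to-dockerignore]
ensure_dockerignore_entry() {
  local entry="$1"
  local file="${2:-.dockerignore}"
  touch "$file"
  # If the file is non-empty and lacks a trailing newline, add one first
  # so the new entry does not get glued to the last line.
  if [ -s "$file" ] && [ -n "$(tail -c1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  # -x matches whole lines, -F treats the entry as a fixed string.
  grep -qxF -- "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
}
```

For the script above, this would be called as ensure_dockerignore_entry "tmp/" from the directory used as build context.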
Notice that I avoided using --cache-from with two tags of the same image (as the original file does with latest and the branch tag). This is because in my experiments the cache was being invalidated: the build was using the oldest image as cache. Also observe that the first docker build command has a --target argument. This tells Docker to build only up to the end of that stage.
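To make the role of --target concrete, here is a small sketch that builds a single named stage and tags it so it can be pushed and reused as cache. The helper name and its argument order are illustrative assumptions; only the docker build flags come from the script above.

```shell
# Hypothetical helper: build one named stage of a multi-stage Dockerfile.
# Usage: build_stage <stage> <image-ref> <cache-image-ref> [dockerfile] [context-dir]
build_stage() {
  local stage="$1"
  local image="$2"
  local cache="$3"
  local dockerfile="${4:-Dockerfile}"
  local context="${5:-.}"
  # --target stops the build at the end of the named stage;
  # --cache-from lets a previously pulled image supply cached layers.
  docker build \
    --target "$stage" \
    --cache-from "$cache" \
    -t "$image" \
    -f "$dockerfile" \
    "$context"
}
```

For example: build_stage builder "gcr.io/$PROJECT_ID/app-builder:$COMMIT_SHA" "gcr.io/$PROJECT_ID/app-builder:master" config/dockerfiles/app.dockerfile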
I changed the default image to master because it guarantees that the base image is stable and is not on a branch too different from yours, which results in better cache performance. The latest tag is unnecessary in my example.
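The fallback logic can also be isolated into a function, which avoids the temporary file. This is only a sketch: the function name is mine, and master as the default fallback is taken from the script above.

```shell
# Hypothetical helper: print the tag to use as cache source.
# Prints <branch> if <repo>:<branch> can be pulled, otherwise the fallback tag.
# Usage: pick_cache_tag <repo> <branch> [fallback]
pick_cache_tag() {
  local repo="$1"
  local branch="$2"
  local fallback="${3:-master}"
  if docker pull "$repo:$branch" >/dev/null 2>&1; then
    printf '%s\n' "$branch"
  else
    printf '%s\n' "$fallback"
  fi
}
```

Inside the build step it could be used as base="$(pick_cache_tag "gcr.io/$PROJECT_ID/app" "$BRANCH_NAME")", with "$base" replacing each $(cat tmp/base).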