I have database migrations which I'd like to run before deploying a new version of my app into a Kubernetes cluster. I want these migrations to be run automatically as part of a Continuous Delivery pipeline. The migration will be encapsulated as a container image. What's the best mechanism to achieve this?
Requirements for a solution:
I had assumed that the Jobs functionality in Kubernetes would make this easy, but there appear to be a few challenges:

- Jobs require a `restartPolicy` of `Never` (or `OnFailure`)
- blocking while waiting on the result of a queued-up job seems to require hand-rolled scripts

Would using "bare pods" be a better approach? If so, how might that work?
Run the database migrations first, before you deploy the new code. This means the old version of the code must work with both database schemas, but the new version can assume that the tables have already been added.
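In pipeline terms, that ordering might look something like this (a sketch only; the job, deployment, and image names are placeholders):

```sh
# 1. Apply the schema changes while the old code is still serving traffic.
kubectl apply -f migration-job.yml
kubectl wait --for=condition=complete --timeout=60s job/migration

# 2. Only then roll out the new application version.
kubectl set image deployment/myapp myapp=registry.example.com/myapp:2.0.0
kubectl rollout status deployment/myapp
```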
> blocking while waiting on the result of a queued-up job seems to require hand-rolled scripts

This isn't necessary anymore thanks to the `kubectl wait` command.
Here's how I'm running db migrations in CI:
```sh
kubectl apply -f migration-job.yml
kubectl wait --for=condition=complete --timeout=60s job/migration
kubectl delete job/migration
```
In case of failure or timeout, one of the first two commands returns a non-zero exit code, which forces the rest of the CI pipeline to terminate.
`migration-job.yml` describes a Kubernetes `Job` resource configured with `restartPolicy: Never` and a reasonably low `activeDeadlineSeconds`.
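For reference, a minimal sketch of such a manifest (the image is a placeholder, and `backoffLimit: 0` is an assumption to keep a failed migration from being retried automatically):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: migration            # must match the job/migration name used above
spec:
  backoffLimit: 0            # assumption: don't retry a failed migration
  activeDeadlineSeconds: 60  # kill the Job if it runs longer than this
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migration
          image: registry.example.com/myapp-migrations:2.0.0  # placeholder image
```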
You could also use the `spec.ttlSecondsAfterFinished` attribute instead of manually running `kubectl delete`, but that's still in alpha at the time of writing and not supported by Google Kubernetes Engine, at least.
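If your cluster does support it, it's a single field on the Job spec, e.g.:

```yaml
spec:
  ttlSecondsAfterFinished: 120  # delete the finished Job (and its pods) after two minutes
```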
You could try to make both the migration jobs and app independent of each other by doing the following:
Combining these two design approaches, you should be able to develop and execute the migration jobs and app independently of each other and not have to introduce any temporal coupling.
Whether this idea is actually reasonable to implement depends on the specifics of your case, such as the complexity of your database migrations. The alternative, as you mentioned, is to simply deploy unmanaged "bare" pods into the cluster to do the migration. This requires a bit more wiring, as you will need to poll the pod's status yourself and distinguish between successful and failed outcomes.
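That polling could look roughly like this (a sketch, assuming a bare pod named `migration`; the manifest name and poll interval are illustrative):

```sh
kubectl apply -f migration-pod.yml

# Poll the pod phase until it terminates one way or the other.
while true; do
  phase=$(kubectl get pod migration -o jsonpath='{.status.phase}')
  [ "$phase" = "Succeeded" ] && break
  [ "$phase" = "Failed" ] && exit 1
  sleep 5
done

kubectl delete pod migration
```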