Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django on Kubernetes Deployment: Best practices for DB Migrations

My Django deployment has x number of pods (3 currently)running a Django backend REST API server. We're still in the development/staging phase. I wanted to ask for advice regarding DB migration. Right now the pods simply start by launching the webserver, assuming the database is migrated and ready. This assumption can be wrong of course.

Can I simply put python manage.py migrate before running the server? What happens if 2 or 3 pods are started at the same time and all run migrations at the same time? would there be any potential damage or problem from that? Is there a best practice pattern to follow here to ensure that all pods start the server with a healthy migrated database?

I was thinking about this:

During initial deployment, define a Kubernetes Job object that'll run once, after the database pod is ready. It will be using the same Django container I have, and will simply run python manage.py migrate. the script that deploys will kubectl wait for that job pod to finish, and then apply the yaml that creates the full Django deployment. This will ensure all django pods "wake up" with a database that's fully migrated.

In subsequent updates, I will run the same job again before re-applying the Django deployment pod upgrades.

Now there is a question of chicken and egg and maintaining 100% uptime during migration, but this is a question for another post: How do you apply data migrations that BREAK existing container Version X when the code to work with the new migrations is updated in container Version X+1. Do you take the entire service offline for the duration of the update? is there a pattern to keep service up and running?

like image 333
JasonGenX Avatar asked Feb 04 '20 14:02

JasonGenX


1 Answers

Well you are right about the part that multiple migrate commands will run against your database by multiple pods getting started.

But this will not cause any problems. When you are going to make actual changes to your database, if the changes are already applied, your changes will be ignored. So, say 3 pods start at the same time and run the migrate command. Only One of those commands will end up applying changes to the database. Migrations normally need to lock the database for different actions (this is highly related to your DBMS). The lock will happen by one of the migrate commands (one of the pods) and other commands should wait until the work of the first one is over. After the job is done by the first one, others' commands will be ignored automatically. So each migration will happen once.

You can however, change your deployment strategy and ask kubernetes to first, spin up only 1 pod and when the first pod's health check succeeds, others will spin up too. In this case, you can be sure that the lock time for the migration, will happen only once and others will just check that migrations are already applied and ignore them automatically.

like image 88
nima Avatar answered Sep 20 '22 17:09

nima