 

Django migration 11 million rows, need to break it down

I am working on a table that contains roughly 11 million rows. I need to run a migration on this table, but since Django tries to store it all in cache, I run out of RAM or disk space (whichever comes first) and the migration comes to an abrupt halt.

I'm curious to know if anyone has faced this issue and has come up with a solution to essentially "paginate" migrations maybe into blocks of 10-20k rows at a time?

Just to give a bit of background: I am using Django 1.10 and Postgres 9.4, and I want to keep this automated if possible (which I still think it can be).

Thanks, Sam

Sam Buckingham asked May 24 '17 19:05




1 Answer

The issue comes from PostgreSQL, which (in versions before 11, including 9.4) rewrites every row of the table when you add a new column (field) with a default value.

What you would need to do is to write your own data migration in the following way:

  1. Add the new column with null=True and no default. In this case the existing rows are not rewritten and the migration finishes quickly.
  2. Migrate it.
  3. Add a default value.
  4. Migrate it again.

That is basically a simple pattern for adding a new column to a huge Postgres table.
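
If the existing rows also need a value, the "blocks of 10-20k rows" idea from the question can be applied in a data migration that runs after the nullable column exists. The following is only a minimal sketch under some assumptions: the app name `myapp`, the model `Record`, the field `status`, the value `"new"` and the helper `id_ranges` are all hypothetical, and it assumes an integer auto-incrementing primary key.

```python
BATCH_SIZE = 10000  # rows touched per UPDATE statement; an arbitrary choice


def id_ranges(first_id, last_id, size):
    """Yield inclusive (start, end) primary-key ranges covering first_id..last_id."""
    start = first_id
    while start <= last_id:
        end = min(start + size - 1, last_id)
        yield start, end
        start = end + 1


def backfill_status(apps, schema_editor):
    # Hypothetical names: app "myapp", model "Record", field "status".
    Record = apps.get_model("myapp", "Record")

    # Only the smallest and largest primary keys are fetched, never whole rows.
    pks = Record.objects.order_by("pk").values_list("pk", flat=True)
    first_id = pks.first()
    if first_id is None:
        return  # empty table, nothing to backfill
    last_id = pks.last()

    for start, end in id_ranges(first_id, last_id, BATCH_SIZE):
        # One UPDATE per block; rows that already have a value are skipped,
        # so the migration can be re-run safely after an interruption.
        Record.objects.filter(
            pk__range=(start, end), status__isnull=True
        ).update(status="new")
```

In the migration file this function would be registered with something like `migrations.RunPython(backfill_status, migrations.RunPython.noop)`. Setting `atomic = False` on the `Migration` class should let each block commit on its own instead of holding one long transaction, but treat that as something to verify against your Django version.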

taras answered Sep 29 '22 07:09