
Django migration being killed

I'm quite confident with Django, but until recently I have mostly relied on generated migrations. I wrote a small custom migration, and shortly afterwards my CI started complaining about timeouts. It turned out to be related to the Django migrations that run during deploy.

At first I was able to fix this issue, but I don't know what I did (if anything) that repaired it. The issue seems to be related to some custom code I added to a specific migration. Here's what I know:

  • Initially all was fine, but after I added my custom code the migrations started taking a (relatively) long time to run, about 10 seconds at a time.
  • It works sometimes, i.e. if I run the migration ten times from the command line, sometimes it will work and sometimes it will fail.

The output is as follows (app names edited out):

[web@dev myapp]$ ./manage.py migrate
Operations to perform:
  Apply all migrations: myapp1, myapp2, myapp3, myapp4
Running migrations:
Killed
  • At first I thought it was because I am using RunPython to run a Python function that copies data between two fields before deleting one of the fields. The documentation discourages its use for PostgreSQL, but is there a better way to do this?
  • The business scenario here is that I had a boolean field that I needed to switch to a set of options (CharField with options). The code checks if the boolean was true and sets the correct value for the character field. I have done this twice. The first time ended up working eventually, but I haven't tested it on another database yet.

This is the migration (app names edited out):

from __future__ import unicode_literals

from django.db import migrations

def fix_consulting(apps, schema_editor):
    my_model = apps.get_model("myapp", "MyModel")
    for m in my_model._default_manager.all():
        if m.consulting:
            m.detail = "CONSLT"
            m.save()


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0024_auto_20160117_1113'),
    ]

    operations = [
        migrations.RunPython(fix_consulting, atomic=False),
    ]

My thoughts:

  • Maybe the code I'm writing here is taking too long to run? There are fewer than one hundred model instances in the database, so I don't know why the fix_consulting function would take so long.

  • If I add print statements at the beginning of fix_consulting, they only run sometimes; other times the process is killed before they run. As it stands, I've run it 6-8 times and it has been killed every time, but at different points.

Other information:

  • Using Django 1.9
  • Using PostgreSQL 9.4.4
  • Error occurs mostly on CentOS, but also on OS X

Jamie Counsell asked Jan 17 '16 21:01


1 Answer

I believe your issue is caused by the amount of data that may need to be cached when you use all(), since it returns every instance of the model. You can do the filtering at the database level instead, before any objects are returned. And since you only need to change the value of one field, you may as well do that at the database level too. Altogether, this changes your code to the following:

def fix_consulting(apps, schema_editor):
    my_model = apps.get_model("myapp", "MyModel")
    my_model._default_manager.filter(consulting=True).update(detail="CONSLT")

This puts the memory management responsibilities on the database, which appears to have solved your issue.
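To make the difference concrete, here is a minimal sketch of what a filtered update amounts to at the database level: a single UPDATE ... WHERE statement, with no rows loaded into Python. It uses Python's built-in sqlite3 module rather than Django and PostgreSQL, purely so that it runs on its own. The table name mymodel and its columns are hypothetical stand-ins for the model in the question, and the SQL Django actually generates may differ in detail.

```python
import sqlite3

# Hypothetical stand-in for the MyModel table. sqlite3 is an assumption made
# only so the sketch runs without a Django project or a PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mymodel (id INTEGER PRIMARY KEY, consulting INTEGER, detail TEXT)"
)
conn.executemany(
    "INSERT INTO mymodel (consulting, detail) VALUES (?, ?)",
    [(1, ""), (0, ""), (1, ""), (0, "")],
)

# One set-based statement: no rows are fetched into Python, and the database
# itself selects and modifies the matching rows.
cursor = conn.execute(
    "UPDATE mymodel SET detail = ? WHERE consulting = ?", ("CONSLT", 1)
)
updated_rows = cursor.rowcount  # number of rows the statement changed

# Read the column back to confirm that only the consulting rows were touched.
details = [row[0] for row in conn.execute("SELECT detail FROM mymodel ORDER BY id")]
```

The loop in the question instead fetches every row, builds a model instance for each one, and issues a separate save() for each match, so the work grows with the size of the table on the Python side.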

Going forward, I'd recommend always trying to narrow what is returned from the database down to only what is actually needed (whether by slicing or by filtering).
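As a rough illustration of that advice, the sketch below shows the two ways of narrowing a read at the SQL level: a WHERE clause for filtering and a LIMIT clause for slicing (Django translates queryset slicing into LIMIT/OFFSET). As before, sqlite3 and the mymodel table are assumptions chosen so the example is self-contained, not part of the original migration.

```python
import sqlite3

# Hypothetical table with 100 rows; every second row has consulting = 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, consulting INTEGER)")
conn.executemany(
    "INSERT INTO mymodel (consulting) VALUES (?)",
    [(i % 2,) for i in range(100)],
)

# Filtering: the WHERE clause keeps non-matching rows inside the database.
consulting_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM mymodel WHERE consulting = 1 ORDER BY id")
]

# Slicing: LIMIT caps how many rows are sent back at all.
first_five = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM mymodel WHERE consulting = 1 ORDER BY id LIMIT 5"
    )
]
```

In both cases the Python process only ever holds the rows it asked for, rather than the whole table.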

Sayse answered Oct 16 '22 07:10