Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Migrate Django model to unique_together constraint

I have a model with three fields

class MyModel(models.Model):
    a    = models.ForeignKey(A)
    b    = models.ForeignKey(B)
    c    = models.ForeignKey(C)

I want to enforce a unique constraint between these fields, and found django's unique_together, which seems to be the solution. However, I already have an existing database, and there are many duplicates. I know that since unique_together works at the database level, I need to unique-ify the rows, and then try a migration.

Is there a good way to go about removing duplicates (where a duplicate has the same (A,B,C)) so that I can run migration to get the unique_together contstraint?

like image 848
jkeesh Avatar asked Nov 24 '11 07:11

jkeesh


People also ask

What is difference between migrate and Makemigrations in Django?

migrate , which is responsible for applying and unapplying migrations. makemigrations , which is responsible for creating new migrations based on the changes you have made to your models.

How do I fix migration issues in Django?

The first step would be to reverse both the migrations using Django migrate command. python manage.py migrate books 0002 would simply reverse all the migrations which were performed after 0002 step. You can verify this by opening the django_migrations table.


1 Answers

If you are happy to choose one of the duplicates arbitrarily, I think the following might do the trick. Perhaps not the most efficient but simple enough and I guess you only need to run this once. Please verify this all works yourself on some test data in case I've done something silly, since you are about to delete a bunch of data.

First we find groups of objects which form duplicates. For each group, (arbitrarily) pick a "master" that we are going to keep. Our chosen method is to pick the one with lowest pk

from django.db.models import Min, Count

master_pks = MyModel.objects.values('A', 'B', 'C'
    ).annotate(Min('pk'), count=Count('pk')
    ).filter(count__gt=1
    ).values_list('pk__min', flat=True)

we then loop over each master, and delete all its duplicates

masters = MyModel.objects.in_bulk( list(master_pks) )

for master in masters.values():
    MyModel.objects.filter(a=master.a, b=master.b, c=master.c
        ).exclude(pk=master.pk).del_ACCIDENT_PREVENTION_ete()
like image 68
second Avatar answered Sep 20 '22 12:09

second