
Django: How to delete duplicate rows based on 2 columns?

I found out in this answer that I can easily delete duplicate rows (duplicates based on N columns) from a table with raw SQL.

Is there an equivalent using the Django ORM? Everything I found for Django only handled duplicates based on a single column.

Note: I know there is a way to prevent future duplicates (based on several fields) in Django, using the unique_together Meta option (but I didn't know about it before).

Thanks.

David D. asked Dec 14 '22 11:12



1 Answer

A direct translation of the SQL in the other answer into the Django ORM:

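from django.db.models import Min

# For each distinct (A, B) pair, find the smallest id: the row to keep.
# order_by() clears any default Meta.ordering, which on older Django versions
# could otherwise be added to the GROUP BY and split the groups.
min_id_objects = MyModel.objects.values('A', 'B').annotate(minid=Min('id')).order_by()
min_ids = [obj['minid'] for obj in min_id_objects]

# Delete every row whose id is not one of the kept ids.
MyModel.objects.exclude(id__in=min_ids).delete()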

This results in two separate SQL queries instead of the single nested query given in the other answer, but I think that is good enough.
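For comparison, here is a minimal sketch of what a single nested statement could look like, run through the standard-library sqlite3 module so it can be tried without a Django project. The table name mymodel and the columns a and b are hypothetical stand-ins, and the SQL is an assumption about the general shape of such a query, not a copy of the one in the other answer.

```python
import sqlite3


def delete_duplicates(conn):
    """Keep the lowest id in each (a, b) group and delete the rest.

    Returns the number of rows deleted.
    """
    cur = conn.execute(
        "DELETE FROM mymodel "
        "WHERE id NOT IN (SELECT MIN(id) FROM mymodel GROUP BY a, b)"
    )
    conn.commit()
    return cur.rowcount


# Hypothetical table with duplicates on the (a, b) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO mymodel (a, b) VALUES (?, ?)",
    [("x", "1"), ("x", "1"), ("x", "2"), ("y", "1"), ("x", "1")],
)

deleted = delete_duplicates(conn)
remaining = conn.execute("SELECT id, a, b FROM mymodel ORDER BY id").fetchall()
print(deleted, remaining)
```

This form runs on SQLite. Some databases, MySQL in particular, reject a DELETE whose subquery reads from the table being deleted from, which is one practical reason to prefer the two-query version.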

NeoWang answered Jan 22 '23 13:01
