How can I filter these Django records?

I have a set of Django models as shown in the following diagram (the names of the reverse relationships are shown in the yellow bubbles):

[Diagram: Django models (source: cbstaff.com)]

In each relationship, a Person may have 0 or more of the items.

Additionally, the slug field is (unfortunately) not unique: multiple Person records may share the same slug. Essentially, these records are duplicates.

I want to obtain a list of all records that meet the following criteria: All duplicate records (that is, having the same slug) with at least one Entry OR at least one Audio OR at least one Episode OR at least one Article.

So far, I have the following query:

from django.db.models import Count
Person.objects.values('slug').annotate(num_records=Count('slug')).filter(num_records__gt=1)

This groups all records by slug, then adds a num_records attribute that says how many records have that slug, but the additional filtering is not performed. I don't even know whether that filtering would work correctly anyway, since, given a set of duplicate records, one may have, e.g., an Entry and another may have an Article.
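To make the criteria concrete, here is a minimal plain-Python sketch (not ORM code) of the group-level check: a slug qualifies if two or more records share it and at least one record in the group has a related item. The tuples in `people` and the function name are hypothetical stand-ins, and the third element is assumed to be the combined count of Entry, Audio, Episode, and Article rows for that person.

```python
from collections import defaultdict

# Hypothetical in-memory rows standing in for Person records:
# (person id, slug, number of related Entry/Audio/Episode/Article rows).
people = [
    (1, "ann", 0),
    (2, "ann", 2),
    (3, "bob", 1),
    (4, "cat", 0),
    (5, "cat", 0),
]


def duplicate_groups_with_related(rows):
    """Return {slug: [ids]} for slugs shared by 2+ rows where any row has related items."""
    groups = defaultdict(list)
    for person_id, slug, related_count in rows:
        groups[slug].append((person_id, related_count))
    return {
        slug: [pid for pid, _ in members]
        for slug, members in groups.items()
        if len(members) > 1 and any(count > 0 for _, count in members)
    }


result = duplicate_groups_with_related(people)
print(result)
```

Because the check runs per group rather than per record, it does not matter which duplicate holds the Entry and which holds the Article.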

In a nutshell, I want to find all duplicate records and collapse them, along with their associated models, into one record.

What's the best way to do this with Django?

mipadi asked Jun 15 '10 19:06



1 Answer

I would do this in several queries. The first builds the list of duplicated slugs, which you already have:

dupes = [p['slug'] for p in Person.objects.values('slug').annotate(num_records=Count('slug')).filter(num_records__gt=1)]

I would then loop through these slugs and, for each one, decide which record to keep (the choice is arbitrary, so pick the first one). Then, for every other primary key with that slug, update the related objects so that they point to the record you kept:

for slug in dupes:
    # Order by primary key so the record that is kept is deterministic.
    pks = list(Person.objects.filter(slug=slug).order_by('pk').values_list('pk', flat=True))
    keep, others = pks[0], pks[1:]
    # Related models as named in the question; each is assumed to have a
    # foreign key named `person`.
    for model in (Audio, Article, Episode, Entry):
        model.objects.filter(person__in=others).update(person=keep)
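The loop above only repoints the related rows; the leftover duplicate Person records still have to be removed to finish the collapse. A minimal plain-Python sketch of the whole merge, using hypothetical in-memory dictionaries in place of database tables (the `people` and `related` structures and the function name are assumptions for illustration, not ORM code):

```python
# Hypothetical in-memory tables; each related row stores the id of its person.
people = {1: "ann", 2: "ann", 3: "ann", 4: "bob"}
related = {
    "Entry": [{"id": 10, "person": 2}],
    "Article": [{"id": 20, "person": 3}, {"id": 21, "person": 4}],
}


def merge_duplicates(people, related):
    """Collapse people sharing a slug into the lowest id; return {removed id: kept id}."""
    keeper_for_slug = {}
    remap = {}
    for person_id in sorted(people):
        slug = people[person_id]
        # The first (lowest) id seen for a slug becomes the record that is kept.
        keeper = keeper_for_slug.setdefault(slug, person_id)
        if keeper != person_id:
            remap[person_id] = keeper
    # Repoint related rows first, then drop the now-unreferenced duplicates.
    for rows in related.values():
        for row in rows:
            row["person"] = remap.get(row["person"], row["person"])
    for person_id in remap:
        del people[person_id]
    return remap


remap = merge_duplicates(people, related)
print(remap, people)
```

In Django terms, the final step would correspond to deleting the remaining duplicate Person rows once nothing references them, and running the updates and the delete inside `transaction.atomic()` should keep the merge all-or-nothing.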
spookylukey answered Oct 12 '22 23:10