Is there a way to query for subset or superset containment with many-to-many fields?
Suppose each Person has a list of birds they want to see, and each Aviary houses a list of birds. How can I make a query to find, for a given Person instance, which Aviaries have every bird on the person's list? And similarly, for a given Person instance, how do I find which Aviaries have only birds on the person's list (but not necessarily all of them).
Here are my Django 1.5 models:
class Bird(models.Model):
name = models.CharField(max_length=255, unique=True)
class Aviary(models.Model):
name = models.CharField(max_length=255, unique=True)
birds = models.ManyToManyField(Bird)
class Person(models.Model):
name = models.CharField(max_length=255, unique=True)
birds_to_see = models.ManyToManyField(Bird)
I know how I would find the aviaries that have at least one of a person's birds, but I don't see how I would adapt that here. (See, for instance: django queryset for many-to-many field)
If there is a query that does what I want, I am also interested to know if/why it's preferable to doing this more "manually." For example, I could loop over aviaries, extract each aviary's bird list, and see if the person's birds_to_see is a subset or superset of the aviary's bird list:
def find_aviaries(self):
person_birds = set(self.birds_to_see.all())
found_aviaries = []
for aviary in Aviary.objects.all():
aviary_birds = set(aviary.birds.all())
if person_birds.issubset(aviary_birds):
found_aviaries.append(aviary)
return found_aviaries
Any help is appreciated!
A nice solution exists for Django >= 2.0. It is possible to annotate Aviaries by the number of matching Birds and to filter Aviaries that match at least one Bird or a required number.
from django.db.models import Count
...
person_birds = set(self.birds_to_see.all())
aviaries = (
Aviary.objects
.annotate(bird_match_count=Count('birds', filter=Q(birds__in=person_birds)))
.filter(bird_match_count__gt=0)
)
It is then trivial to filter a new queryset by bird_match_count=len(person_birds)
or to filter the original queryset in Python or to sort it by bird_match_count.
Django < 2.0 would require to reference the intermediary model AviaryBirds
and will be more verbose.
Verified by reading the SQL:
>>> print(aviaries.query)
SELECT aviary.id, aviary.name,
COUNT(CASE WHEN aviary_birds.bird_id IN (1,..) THEN aviary_birds.bird_id ELSE NULL END)
AS bird_match_count
FROM aviary LEFT OUTER JOIN aviary_birds ON (aviary.id = aviary_birds.aviary_id)
GROUP BY aviary.id, aviary.name
HAVING
COUNT(CASE WHEN (aviary_birds.bird_id IN (1,..)) THEN aviary_birds.bird_id ELSE NULL END)
> 0
Using a Postgres Subquery Array construct, you can annotate on the ids, and then filter accordingly:
birds = Aviary.birds.through.objects.filter(
aviary=OuterRef('pk')
).values('bird')
aviaries = Aviary.objects.annotate(
bird_ids=SubqueryArray(birds)
).filter(bird_ids__contains=target_bird_ids)
You can also use __contained_by
to go the other way (or __overlap
if you just want any match).
All you need then is a suitable SubqueryArray
class:
class SubqueryArray(django.db.models.expressions.Subquery):
template = 'ARRAY(%(subquery)s)'
output_field = ArrayField(base_field=models.CharField())
You may need to adjust the output field on that, depending upon what your PK field is.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With