Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django filter queryset __in for *every* item in list

People also ask

How can I filter a Django query with a list of values?

To filter a Python Django query with a list of values, we can use the filter method with in . to search Blog entries with pk set to 1,4 or 7 by calling Blog. objects. filter with the pk_in argument set to [1, 4, 7] .

What is __ In Django-filter?

Field Lookups. Please note the double underscore ( __ ) between the field name(depe_desc) and lookup type keyword contains. When Django execute the filter method with above field lookup argument, it will translate it to below sql statements.

Can I filter a QuerySet Django?

The filter() method is used to filter you search, and allows you to return only the rows that matches the search term.

What is the use of QuerySet in Django?

A QuerySet is a collection of data from a database. A QuerySet is built up as a list of objects. QuerySets makes it easier to get the data you actually need, by allowing you to filter and order the data.


Summary:

One option is, as suggested by jpic and sgallen in the comments, to add .filter() for each category. Each additional filter adds more joins, which should not be a problem for small set of categories.

There is the aggregation approach. This query would be shorter and perhaps quicker for a large set of categories.

You also have the option of using custom queries.


Some examples

Test setup:

class Photo(models.Model):
    tags = models.ManyToManyField('Tag')

class Tag(models.Model):
    name = models.CharField(max_length=50)

    def __unicode__(self):
        return self.name

In [2]: t1 = Tag.objects.create(name='holiday')
In [3]: t2 = Tag.objects.create(name='summer')
In [4]: p = Photo.objects.create()
In [5]: p.tags.add(t1)
In [6]: p.tags.add(t2)
In [7]: p.tags.all()
Out[7]: [<Tag: holiday>, <Tag: summer>]

Using chained filters approach:

In [8]: Photo.objects.filter(tags=t1).filter(tags=t2)
Out[8]: [<Photo: Photo object>]

Resulting query:

In [17]: print Photo.objects.filter(tags=t1).filter(tags=t2).query
SELECT "test_photo"."id"
FROM "test_photo"
INNER JOIN "test_photo_tags" ON ("test_photo"."id" = "test_photo_tags"."photo_id")
INNER JOIN "test_photo_tags" T4 ON ("test_photo"."id" = T4."photo_id")
WHERE ("test_photo_tags"."tag_id" = 3  AND T4."tag_id" = 4 )

Note that each filter adds more JOINS to the query.

Using annotation approach:

In [29]: from django.db.models import Count
In [30]: Photo.objects.filter(tags__in=[t1, t2]).annotate(num_tags=Count('tags')).filter(num_tags=2)
Out[30]: [<Photo: Photo object>]

Resulting query:

In [32]: print Photo.objects.filter(tags__in=[t1, t2]).annotate(num_tags=Count('tags')).filter(num_tags=2).query
SELECT "test_photo"."id", COUNT("test_photo_tags"."tag_id") AS "num_tags"
FROM "test_photo"
LEFT OUTER JOIN "test_photo_tags" ON ("test_photo"."id" = "test_photo_tags"."photo_id")
WHERE ("test_photo_tags"."tag_id" IN (3, 4))
GROUP BY "test_photo"."id", "test_photo"."id"
HAVING COUNT("test_photo_tags"."tag_id") = 2

ANDed Q objects would not work:

In [9]: from django.db.models import Q
In [10]: Photo.objects.filter(Q(tags__name='holiday') & Q(tags__name='summer'))
Out[10]: []
In [11]: from operator import and_
In [12]: Photo.objects.filter(reduce(and_, [Q(tags__name='holiday'), Q(tags__name='summer')]))
Out[12]: []

Resulting query:

In [25]: print Photo.objects.filter(Q(tags__name='holiday') & Q(tags__name='summer')).query
SELECT "test_photo"."id"
FROM "test_photo"
INNER JOIN "test_photo_tags" ON ("test_photo"."id" = "test_photo_tags"."photo_id")
INNER JOIN "test_tag" ON ("test_photo_tags"."tag_id" = "test_tag"."id")
WHERE ("test_tag"."name" = holiday  AND "test_tag"."name" = summer )

Another approach that works, although PostgreSQL only, is using django.contrib.postgres.fields.ArrayField:

Example copied from docs:

>>> Post.objects.create(name='First post', tags=['thoughts', 'django'])
>>> Post.objects.create(name='Second post', tags=['thoughts'])
>>> Post.objects.create(name='Third post', tags=['tutorial', 'django'])

>>> Post.objects.filter(tags__contains=['thoughts'])
<QuerySet [<Post: First post>, <Post: Second post>]>

>>> Post.objects.filter(tags__contains=['django'])
<QuerySet [<Post: First post>, <Post: Third post>]>

>>> Post.objects.filter(tags__contains=['django', 'thoughts'])
<QuerySet [<Post: First post>]>

ArrayField has some more powerful features such as overlap and index transforms.


This also can be done by dynamic query generation using Django ORM and some Python magic :)

from operator import and_
from django.db.models import Q

categories = ['holiday', 'summer']
res = Photo.filter(reduce(and_, [Q(tags__name=c) for c in categories]))

The idea is to generate appropriate Q objects for each category and then combine them using AND operator into one QuerySet. E.g. for your example it'd be equal to

res = Photo.filter(Q(tags__name='holiday') & Q(tags__name='summer'))

If you struggled with this problem as i did and nothing mentioned helped you, maybe this one will solve your issue

Instead of chaining filter, in some cases it would be better just to store ids of previous filter

tags = [1, 2]
for tag in tags:
    ids = list(queryset.filter(tags__id=tag).values_list("id", flat=True))
    queryset = queryset.filter(id__in=ids)

Using this approach will help you to avoid stacking JOIN in SQL query:


I use a little function that iterates filters over a list for a given operator an a column name :

def exclusive_in (cls,column,operator,value_list):         
    myfilter = column + '__' + operator
    query = cls.objects
    for value in value_list:
        query=query.filter(**{myfilter:value})
    return query  

and this function can be called like that:

exclusive_in(Photo,'tags__name','iexact',['holiday','summer'])

it also work with any class and more tags in the list; operators can be anyone like 'iexact','in','contains','ne',...