How to annotate Count with a condition in a Django queryset

Tags:

Using Django ORM, can one do something like queryset.objects.annotate(Count('queryset_objects', gte=VALUE)). Catch my drift?

Here's a quick example to use for illustrating a possible answer:

In a Django website, content creators submit articles, and regular users view (i.e. read) the said articles. Articles can either be published (i.e. available for all to read), or in draft mode. The models depicting these requirements are:

class Article(models.Model):     author = models.ForeignKey(User)     published = models.BooleanField(default=False)  class Readership(models.Model):     reader = models.ForeignKey(User)     which_article = models.ForeignKey(Article)     what_time = models.DateTimeField(auto_now_add=True)

My question is: How can I get all published articles, sorted by unique readership from the last 30 mins? I.e. I want to count how many distinct (unique) views each published article got in the last half an hour, and then produce a list of articles sorted by these distinct views.

I tried:

date = datetime.now()-timedelta(minutes=30) articles = Article.objects.filter(published=True).extra(select = {   "views" : """   SELECT COUNT(*)   FROM myapp_readership     JOIN myapp_article on myapp_readership.which_article_id = myapp_article.id   WHERE myapp_readership.reader_id = myapp_user.id   AND myapp_readership.what_time > %s """ % date, }).order_by("-views")

This sprang the error: syntax error at or near "01" (where "01" was the datetime object inside extra). It's not much to go on.

899

asked Nov 18 '15 08:11

Hassan Baig

Video Answer

1 Answers

For django >= 1.8

Use Conditional Aggregation:

from django.db.models import Count, Case, When, IntegerField Article.objects.annotate(     numviews=Count(Case(         When(readership__what_time__lt=treshold, then=1),         output_field=IntegerField(),     )) )

Explanation: normal query through your articles will be annotated with numviews field. That field will be constructed as a CASE/WHEN expression, wrapped by Count, that will return 1 for readership matching criteria and NULL for readership not matching criteria. Count will ignore nulls and count only values.

You will get zeros on articles that haven't been viewed recently and you can use that numviews field for sorting and filtering.

Query behind this for PostgreSQL will be:

SELECT     "app_article"."id",     "app_article"."author",     "app_article"."published",     COUNT(         CASE WHEN "app_readership"."what_time" < 2015-11-18 11:04:00.000000+01:00 THEN 1         ELSE NULL END     ) as "numviews" FROM "app_article" LEFT OUTER JOIN "app_readership"     ON ("app_article"."id" = "app_readership"."which_article_id") GROUP BY "app_article"."id", "app_article"."author", "app_article"."published"

If we want to track only unique queries, we can add distinction into Count, and make our When clause to return value, we want to distinct on.

from django.db.models import Count, Case, When, CharField, F Article.objects.annotate(     numviews=Count(Case(         When(readership__what_time__lt=treshold, then=F('readership__reader')), # it can be also `readership__reader_id`, it doesn't matter         output_field=CharField(),     ), distinct=True) )

That will produce:

SELECT     "app_article"."id",     "app_article"."author",     "app_article"."published",     COUNT(         DISTINCT CASE WHEN "app_readership"."what_time" < 2015-11-18 11:04:00.000000+01:00 THEN "app_readership"."reader_id"         ELSE NULL END     ) as "numviews" FROM "app_article" LEFT OUTER JOIN "app_readership"     ON ("app_article"."id" = "app_readership"."which_article_id") GROUP BY "app_article"."id", "app_article"."author", "app_article"."published"

For django < 1.8 and PostgreSQL

You can just use raw for executing SQL statement created by newer versions of django. Apparently there is no simple and optimized method for querying that data without using raw (even with extra there are some problems with injecting required JOIN clause).

Articles.objects.raw('SELECT'     '    "app_article"."id",'     '    "app_article"."author",'     '    "app_article"."published",'     '    COUNT('     '        DISTINCT CASE WHEN "app_readership"."what_time" < 2015-11-18 11:04:00.000000+01:00 THEN "app_readership"."reader_id"'     '        ELSE NULL END'     '    ) as "numviews"'     'FROM "app_article" LEFT OUTER JOIN "app_readership"'     '    ON ("app_article"."id" = "app_readership"."which_article_id")'     'GROUP BY "app_article"."id", "app_article"."author", "app_article"."published"')

146

answered Sep 25 '22 01:09

GwynBleidD

Related questions
                            
                                How to modify datetime.datetime.hour in Python?
                            
                                What should I use instead of syncdb in Django 1.9?
                            
                                How to make Django's DateTimeField optional?
                            
                                Finding the intersection between two series in Pandas
                            
                                How to get a list of all indexes in python-elasticsearch
                            
                                ImportError: No module named 'selenium'
                            
                                How to check if an element of a list is a list (in Python)?
                            
                                Rotate image and crop out black borders
                            
                                How to convert rows in DataFrame in Python to dictionaries
                            
                                Why does defining __getitem__ on a class make it iterable in python?
                            
                                Convert png to jpeg using Pillow
                            
                                Elegant way to perform tuple arithmetic
                            
                                How to install cryptography on ubuntu?
                            
                                Corpora/stopwords not found when import nltk library
                            
                                Python Argument Binders
                            
                                Cannot open include file: 'io.h': No such file or directory
                            
                                Callable modules
                            
                                Factorial in numpy and scipy
                            
                                can you add HTTPS functionality to a python flask web server?
                            
                                pip install pygraphviz: No package 'libcgraph' found

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to annotate Count with a condition in a Django queryset

Tags:

python

django

django-queryset