Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django: order a queryset by the sum of annotated fields?

I'm building a Django site for discussions. Users can participate in discussions and can also cast votes of approval for discussions and messages in the discussions. A simplified data model is as follows:

class Discussion:
    name = models.CharField(max_length=255)

class Message:
    owner = models.ForeignKey(User, related_name='messages')
    body = models.TextField()
    discussion = models.ForeignKey(Discussion, related_name='messages')

class MessageApprovalVote:
    owner = models.ForeignKey(User, related_name='message_approval_votes')
    message = models.ForeignKey(Message, related_name='approval_votes')

class DiscussionApprovalVote:
    owner = models.ForeignKey(User, related_name='discussion_approval_votes')
    discussion = models.ForeignKey(Discussion, related_name='approval_votes')

I want to select the top 20 "most active" discussions, which means ordering by the sum of the number of messages, total number of message approval votes, and number of discussion approval votes for that discussion, or (in pseudocode):

# Doesn't work
Discussion.objects.
    order_by(Count('messages') + 
             Count('approval_votes') + 
             Count('messages__approval_votes'))

Using annotations, I can calculate the totals of each of the three scoring factors:

scores = Discussion.objects.annotate(
    total_messages=Count('messages', distinct=True),
    total_discussion_approval_votes=Count('approval_votes', distinct=True),
    total_message_approval_votes=Count('messages__approval_votes', distinct=True))

I then thought I was on to something when I found the extra method:

total_scores = scores.extra(
    select={
        'score_total': 'total_messages + total_discussion_approval_votes + total_message_approval_votes'
    }
)

and would then be able to do:

final_answer = total_scores.order_by('-score_total')[:20]

but the extra call gives a DatabaseError:

DatabaseError: column "total_messages" does not exist
LINE 1: SELECT (total_votes + total_messages + total_persuasions) AS...

and thus I was foiled. Can the extra method not reference annotated fields? Is there another way to do what I'm trying to do, short of using a raw sql query? I'm using Postgres if that makes a difference.

Any insights would be greatly appreciated!

like image 981
davidscolgan Avatar asked Apr 03 '13 23:04

davidscolgan


People also ask

What is the difference between aggregate and annotate in Django?

Unlike aggregate() , annotate() is not a terminal clause. The output of the annotate() clause is a QuerySet ; this QuerySet can be modified using any other QuerySet operation, including filter() , order_by() , or even additional calls to annotate() .

What is annotate in Django QuerySet?

annotate()Annotates each object in the QuerySet with the provided list of query expressions.

Can we order data in descending order in Django?

if you want to select by Descending just add minus operator before the attribute field or if you want to select by Ascending no need minus operator.


1 Answers

I don't think this is possible in a single top-level SQL query. The score_total value depends on the three aggregate results, but you're asking them all to be calculated at the same time.

In straight SQL, you can do this with a subquery, but I'm not sure how to inject it into Django. After setting up a simple Django app with your models, the following query seems to do the trick against a SQLite database:

SELECT  id, name,
    total_messages, total_discussion_approval_votes, total_message_approval_votes,
    (total_messages +
     total_discussion_approval_votes +
     total_message_approval_votes) as score_total
FROM
   (SELECT
    discussion.id,
    discussion.name,
    COUNT(DISTINCT discussionapprovalvote.id) AS total_discussion_approval_votes,
    COUNT(DISTINCT messageapprovalvote.id) AS total_message_approval_votes,
    COUNT(DISTINCT message.id) AS total_messages
    FROM discussion
    LEFT OUTER JOIN discussionapprovalvote
         ON (discussion.id = discussionapprovalvote.discussion_id)
    LEFT OUTER JOIN message
         ON (discussion.id = message.discussion_id)
    LEFT OUTER JOIN messageapprovalvote
         ON (message.id = messageapprovalvote.message_id)
    GROUP BY discussion.id, discussion.name)

ORDER BY score_total DESC
LIMIT 20;
like image 151
Tom Nurkkala Avatar answered Oct 16 '22 12:10

Tom Nurkkala