Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django annotate() multiple times causes wrong answers

Django has the great new annotate() function for querysets. However I can't get it to work properly for multiple annotations in a single queryset.

For example,

tour_list = Tour.objects.all().annotate( Count('tourcomment') ).annotate( Count('history') ) 

A tour can contain multiple tourcomment and history entries. I'm trying to get how many comments and history entries exist for this tour. The resulting

history__count and tourcomment__count 

values will be incorrect. If there's only one annotate() call the value will be correct.

There seems to be some kind of multiplicative effect coming from the two LEFT OUTER JOINs. For example, if a tour has 3 histories and 3 comments, 9 will be the count value for both. 12 histories + 1 comment = 12 for both values. 1 history + 0 comment = 1 history, 0 comments (this one happens to return the correct values).

The resulting SQL call is:

SELECT `testapp_tour`.`id`, `testapp_tour`.`operator_id`, `testapp_tour`.`name`, `testapp_tour`.`region_id`, `testapp_tour`.`description`, `testapp_tour`.`net_price`, `testapp_tour`.`sales_price`, `testapp_tour`.`enabled`, `testapp_tour`.`num_views`, `testapp_tour`.`create_date`, `testapp_tour`.`modify_date`, `testapp_tour`.`image1`, `testapp_tour`.`image2`, `testapp_tour`.`image3`, `testapp_tour`.`image4`, `testapp_tour`.`notes`, `testapp_tour`.`pickup_time`, `testapp_tour`.`dropoff_time`, COUNT(`testapp_tourcomment`.`id`) AS `tourcomment__count`, COUNT(`testapp_history`.`id`) AS `history__count`  FROM `testapp_tour` LEFT OUTER JOIN `testapp_tourcomment` ON (`testapp_tour`.`id` = `testapp_tourcomment`.`tour_id`) LEFT OUTER JOIN `testapp_history` ON (`testapp_tour`.`id` = `testapp_history`.`tour_id`) GROUP BY `testapp_tour`.`id` ORDER BY `testapp_tour`.`name` ASC 

I have tried combining the results from two querysets that contain a single call to annotate (), but it doesn't work right... You can't really guarantee that the order will be the same. and it seems overly complicated and messy so I've been looking for something better...

tour_list = Tour.objects.all().filter(operator__user__exact = request.user ).filter(enabled__exact = True).annotate( Count('tourcomment') ) tour_list_historycount = Tour.objects.all().filter( enabled__exact = True ).annotate( Count('history') ) for i,o in enumerate(tour_list):     o.history__count = tour_list_historycount[i].history__count 

Thanks for any help. Stackoverflow has saved my butt in the past with a lot of already-answered questions, but I wasn't able to find an answer to this one yet.

like image 895
tbak Avatar asked Aug 12 '09 09:08

tbak


People also ask

How does annotate work in Django?

In the Django framework, both annotate and aggregate are responsible for identifying a given value set summary. Among these, annotate identifies the summary from each of the items in the queryset. Whereas in the case of aggregate, the summary is calculated for the entire queryset.

What is difference between annotate and aggregate Django?

Aggregate calculates values for the entire queryset. Annotate calculates summary values for each item in the queryset.

Why annotate is used in Django?

The Django ORM is a convenient way to extract data from the database, and the annotate() clause that you can use with QuerySets is also a useful way to dynamically generate additional data for each object when the data is being extracted.


1 Answers

Thanks for your comment. That didn't quite work but it steered me in the right direction. I was finally able to solve this by adding distinct to both Count() calls:

Count('tourcomment', distinct=True) 
like image 95
tbak Avatar answered Sep 25 '22 07:09

tbak