Django annotate and count: how to filter which related objects are included in the count

Given a queryset, I add the count of related objects (ModelA) with the following:

qs = User.objects.all()
qs = qs.annotate(modela__count=models.Count('modela'))

However, is there a way to count only the ModelA objects that meet a criterion? For example, count only the ModelA objects where deleted_at is null?

I have tried two solutions, neither of which works properly.

1) As @knbk suggested, use filter before you annotate.

qs = User.objects.all().filter(modela__deleted_at__isnull=True).annotate(modela__count=models.Count('modela', distinct=True))

Here is the simplified version of the query generated by django:

SELECT COUNT(DISTINCT "modela"."id") AS "modela__count", "users".*
FROM "users"
LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" ) 
WHERE "modela"."deleted_at" IS NULL 
GROUP BY "users"."id"

The problem comes from the WHERE clause. There is a LEFT JOIN, but the WHERE condition is applied after the join, so users whose ModelA rows are all deleted are dropped from the result instead of being kept with a count of 0. I need to move the condition up into the JOIN clause to make it work as intended.

So, instead of

LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" ) 
WHERE "modela"."deleted_at" IS NULL

I need the following which works when I execute it directly in plain SQL.

LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" ) 
AND "modela"."deleted_at" IS NULL
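The difference between the two forms can be reproduced outside Django. Below is a minimal sketch using Python's built-in sqlite3 module; the three sample users (no_rows, only_deleted, mixed) and their ModelA rows are made up for illustration, and only the table and column names come from the queries above.

```python
import sqlite3

# Hypothetical sample data: one user with no modela rows, one whose only
# modela row is deleted, and one with a live row plus a deleted row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE modela (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        deleted_at TEXT
    );
    INSERT INTO users VALUES (1, 'no_rows'), (2, 'only_deleted'), (3, 'mixed');
    INSERT INTO modela VALUES
        (1, 2, '2015-04-01'),
        (2, 3, NULL),
        (3, 3, '2015-04-02');
""")

# Condition in WHERE: applied after the join, so it can remove users.
where_sql = """
    SELECT users.name, COUNT(DISTINCT modela.id)
    FROM users
    LEFT OUTER JOIN modela ON users.id = modela.user_id
    WHERE modela.deleted_at IS NULL
    GROUP BY users.id
    ORDER BY users.id
"""

# Condition in ON: only restricts which modela rows are joined.
on_sql = """
    SELECT users.name, COUNT(DISTINCT modela.id)
    FROM users
    LEFT OUTER JOIN modela
        ON users.id = modela.user_id AND modela.deleted_at IS NULL
    GROUP BY users.id
    ORDER BY users.id
"""

where_rows = conn.execute(where_sql).fetchall()
on_rows = conn.execute(on_sql).fetchall()

print(where_rows)  # [('no_rows', 0), ('mixed', 1)]
print(on_rows)     # [('no_rows', 0), ('only_deleted', 0), ('mixed', 1)]
```

With the condition in WHERE, the user whose only ModelA row is deleted disappears from the result; with the condition in ON, that user is kept with a count of 0.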

How can I change the queryset to get this without doing a raw query?

2) As others suggested, I could use a conditional aggregation.

I tried the following:

qs = User.objects.all().annotate(modela__count=models.Count(models.Case(models.When(modela__deleted_at__isnull=True, then=1))))

which turns into the following SQL query:

SELECT COUNT(CASE WHEN "modela"."deleted_at" IS NULL THEN 1 ELSE NULL END) AS "modela__count", "users".*
FROM "users" LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" )
GROUP BY "users"."id"

By doing that, I get all the users (so the LEFT JOIN works properly) but I get "1" (instead of 0) for modela__count for all the users who don't have any ModelA at all. Why do I get 1 and not 0 if there is nothing to count? How can that be changed?

Michael asked Apr 03 '15 22:04


2 Answers

In a LEFT JOIN, every field of modela can be NULL when there is no corresponding row. So

modela.deleted_at IS NULL

...is true not only for the matching rows, but also for those users who have no corresponding modela rows.

I think the right SQL should be:

SELECT COUNT(
    CASE
      WHEN
        `modela`.`user_id` IS NOT NULL  -- Make sure modela rows exist
        AND `modela`.`deleted_at` IS NULL
        THEN 1
      ELSE NULL
    END
  ) AS `modela__count`,
  `users`.*
FROM `users`
LEFT OUTER JOIN `modela`
  ON ( `users`.`id` = `modela`.`user_id` )
GROUP BY `users`.`id`
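To check this reasoning without a Django project, here is a small sketch with Python's built-in sqlite3 module that runs the original CASE expression next to the guarded one. The sample users and rows are invented for illustration, and SQLite stands in for whatever database is actually used, so treat it as a demonstration of the SQL semantics rather than of Django's exact output.

```python
import sqlite3

# Hypothetical sample data: user 1 has no modela rows, user 2 has only a
# deleted row, user 3 has one live row and one deleted row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE modela (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        deleted_at TEXT
    );
    INSERT INTO users VALUES (1, 'no_rows'), (2, 'only_deleted'), (3, 'mixed');
    INSERT INTO modela VALUES
        (1, 2, '2015-04-01'),
        (2, 3, NULL),
        (3, 3, '2015-04-02');
""")

# Original CASE: the NULL-extended row of a user without modela rows also
# satisfies "deleted_at IS NULL", so it is counted once.
naive_sql = """
    SELECT users.name,
           COUNT(CASE WHEN modela.deleted_at IS NULL THEN 1 ELSE NULL END)
    FROM users
    LEFT OUTER JOIN modela ON users.id = modela.user_id
    GROUP BY users.id
    ORDER BY users.id
"""

# Guarded CASE: also require that a modela row actually matched.
guarded_sql = """
    SELECT users.name,
           COUNT(CASE WHEN modela.user_id IS NOT NULL
                       AND modela.deleted_at IS NULL
                      THEN 1 ELSE NULL END)
    FROM users
    LEFT OUTER JOIN modela ON users.id = modela.user_id
    GROUP BY users.id
    ORDER BY users.id
"""

naive_rows = conn.execute(naive_sql).fetchall()
guarded_rows = conn.execute(guarded_sql).fetchall()

print(naive_rows)    # [('no_rows', 1), ('only_deleted', 0), ('mixed', 1)]
print(guarded_rows)  # [('no_rows', 0), ('only_deleted', 0), ('mixed', 1)]
```

Only the user with no modela rows changes between the two results, which matches the "1 instead of 0" symptom described in the question.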

In Django 1.8 this should be:

from django.db import models
qs = User.objects.all().annotate(
    modela_count=models.Count(
        models.Case(
            models.When(
                modela__user_id__isnull=False,
                modela__deleted_at__isnull=True,
                then=1,
            )
        )
    )
)

Note:

@YAmikep discovered that a bug in Django 1.8.0 makes the generated SQL use an INNER JOIN instead of a LEFT JOIN, so you will lose the rows that have no corresponding foreign key relationship. Use Django 1.8.2 or later to avoid that.

Rockallite answered Oct 20 '22 00:10


In Django 1.8 I believe this can be achieved with conditional aggregation. However, for previous versions I would do it with .extra:

ModelA.objects.extra(select={
    'account_count': 'SELECT COUNT(*) FROM account WHERE modela.account_id = account.id AND account.some_prop IS NOT NULL'
})
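The string passed to .extra is a correlated subquery in the SELECT list. As a rough sketch of how that approach behaves, here is the same idea adapted to the users/modela tables from the question and run through Python's built-in sqlite3 module. The adaptation, the sample data, and the modela_count alias are assumptions made for illustration; they are not part of the snippet above.

```python
import sqlite3

# Hypothetical sample data matching the question's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE modela (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        deleted_at TEXT
    );
    INSERT INTO users VALUES (1, 'no_rows'), (2, 'only_deleted'), (3, 'mixed');
    INSERT INTO modela VALUES
        (1, 2, '2015-04-01'),
        (2, 3, NULL),
        (3, 3, '2015-04-02');
""")

# The subquery is evaluated once per user, so no join or GROUP BY is
# needed and every user appears in the result exactly once.
subquery_sql = """
    SELECT users.name,
           (SELECT COUNT(*)
              FROM modela
             WHERE modela.user_id = users.id
               AND modela.deleted_at IS NULL) AS modela_count
    FROM users
    ORDER BY users.id
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
print(subquery_rows)  # [('no_rows', 0), ('only_deleted', 0), ('mixed', 1)]
```

Because COUNT(*) over an empty subquery result is 0, users without any matching rows get 0 rather than being dropped.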
Todor answered Oct 19 '22 23:10