Given a queryset, I add the count of related objects (ModelA) with the following:
qs = User.objets.all()
qs.annotate(modela__count=models.Count('modela'))
However, is there a way to count the ModelA that only meet a criteria? For example, count the ModelA where deleted_at is null?
I have tried two solutions which do not properly work.
1) As @knbk suggested, use filter before you annotate.
qs = User.objects.all().filter(modela__deleted_at__isnull=True).annotate(modela__count=models.Count('modela', distinct=True))
Here is the simplified version of the query generated by django:
SELECT COUNT(DISTINCT "modela"."id") AS "modela__count", "users".*
FROM "users"
LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" )
WHERE "modela"."deleted_at" IS NULL
GROUP BY "users"."id"
The problem comes from the WHERE clause. Indeed, there is a LEFT JOIN but the later WHERE conditions forced it to be a plain JOIN. I need to pull the conditions up into the JOIN clause to make it work as intended.
So, instead of
LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" )
WHERE "modela"."deleted_at" IS NULL
I need the following which works when I execute it directly in plain SQL.
LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" )
AND "modela"."deleted_at" IS NULL
How can I change the queryset to get this without doing a raw query?
2) As others suggested, I could use a conditional aggregation.
I tried the following:
qs = User.objects.all().annotate(modela__count=models.Count(Case(When(modela__deleted_at__isnull=True, then=1))))
which turns into the following SQL query:
SELECT COUNT(CASE WHEN "modela"."deleted_at" IS NULL THEN 1 ELSE NULL END) AS "modela__count", "users".*
FROM "users" LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" )
GROUP BY "users"."id"
By doing that, I get all the users (so the LEFT JOIN works properly) but I get "1" (instead of 0) for modela__count
for all the users who don't have any ModelA at all.
Why do I get 1 and not 0 if there is nothing to count?
How can that be changed?
In a LEFT JOIN
, every field of modela
could be NULL
because of the absence of corresponding row. So
modela.deleted_at IS NULL
...is not only true for the matching rows, but also true for those users
whose have no corresponding modela
rows.
I think the right SQL should be:
SELECT COUNT(
CASE
WHEN
`modela`.`user_id` IS NOT NULL -- Make sure modela rows exist
AND `modela`.`deleted_at` IS NULL
THEN 1
ELSE NULL
END
) AS `modela__count`,
`users`.*
FROM `users`
LEFT OUTER JOIN `modela`
ON ( `users`.`id` = `modela`.`user_id` )
GROUP BY `users`.`id`
In Django 1.8 this should be:
from django.db import models
qs = User.objects.all().annotate(
modela_count=models.Count(
models.Case(
models.When(
modela__user_id__isnull=False,
modela__deleted_at__isnull=True,
then=1,
)
)
)
)
Notice:
@YAmikep discovered that a bug in Django 1.8.0 makes the generated SQL have an INNER JOIN
instead of a LEFT JOIN
, so you will lose rows without corresponding foreign key relationship. Use Django 1.8.2 or above version to fix that.
In Django 1.8 I believe this can be achieved with conditional aggregation . However for previous versions I would do it with .extra
ModelA.objects.extra(select={
'account_count': 'SELECT COUNT(*) FROM account WHERE modela.account_id = account.id AND account.some_prop IS NOT NULL'
})
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With