Django aggregate queries with expressions

I have a model XYZ and I need to get the max value for fields a, b, and expression x/y for a given queryset.

It works beautifully for fields. Something like:

>>> XYZ.all().aggregate(Max('a'))

... {'a__max': 10}

However, I can't find a way to do it for expressions. Trying something like:

>>> XYZ.all().aggregate(Max('x/y'))

Gives an error:

*** FieldError: Cannot resolve keyword 'x/y' into field. Choices are: a, b, x, y, id

Trying something like:

>>> XYZ.all().aggregate(Max(F('x')/F('y')))

Gives an error:

*** AttributeError: 'ExpressionNode' object has no attribute 'split'

And even something like:

XYZ.all().extra(select={'z':'x/y'}).aggregate(Max('z'))

Also doesn't work and gives the same kind of error as the first attempt:

FieldError: Cannot resolve keyword 'z' into field. Choices are: a, b, x, y, id

The one hack I found to do it is:

XYZ.all().extra(select={'z':'MAX(x/y)'})[0].z

That works because it generates the right SQL, but it's confusing: I get the right value in the z attribute, but not the right instance (the one that actually has that max value).

Of course, I could also use raw queries or tricks with extra() and order_by(), but it really doesn't make sense to me that Django goes all the way to support aggregate queries in a nice way, but can't support expressions even with its own F expressions.

Is there any way to do it?

Pedro Werneck asked Apr 19 '12 03:04



2 Answers

In SQL, what you want is actually

SELECT x/y, * FROM XYZ ORDER BY x/y DESC LIMIT 1;
-- Or, a more verbose version of the first query
SELECT x/y, id, a, b, x, y FROM XYZ GROUP BY x/y, id, a, b, x, y ORDER BY x/y DESC LIMIT 1;
-- Or
SELECT * FROM XYZ WHERE x/y = (SELECT MAX(x/y) FROM XYZ) LIMIT 1;

Thus in Django ORM:

XYZ.objects.extra(select={'z':'x/y'}).order_by('-z')[0]
# Or
XYZ.objects.extra(select={'z':'x/y'}).annotate().order_by('-z')[0]
# Or, since x/y = z implies x = y*z, filter on the precomputed maximum
XYZ.objects.filter(x=models.F('y') * XYZ.objects.extra(select={'z':'MAX(x/y)'})[0].z)[0]
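
The "order by" and "subquery" SQL variants above can be checked without a Django project. The following is a minimal sketch using Python's built-in sqlite3 module; the table layout and sample rows are made up for illustration, and x and y are stored as REAL so that x/y is not integer division.

```python
import sqlite3

# Hypothetical sample data: (id, a, b, x, y); ratios x/y are 0.25, 1.5, 1.0
rows = [(1, 10, 3, 1.0, 4.0), (2, 7, 9, 3.0, 2.0), (3, 2, 5, 5.0, 5.0)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xyz (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, x REAL, y REAL)"
)
conn.executemany("INSERT INTO xyz VALUES (?, ?, ?, ?, ?)", rows)

# Variant 1: sort by the expression and take the first row
by_order = conn.execute(
    "SELECT x/y AS z, id FROM xyz ORDER BY x/y DESC LIMIT 1"
).fetchone()

# Variant 3: compare each row's expression against the overall maximum
by_subquery = conn.execute(
    "SELECT x/y AS z, id FROM xyz "
    "WHERE x/y = (SELECT MAX(x/y) FROM xyz) LIMIT 1"
).fetchone()

print(by_order)     # (1.5, 2)
print(by_subquery)  # (1.5, 2)
```

Both queries return the maximum ratio together with the id of the row that produced it, which is what the extra(...).order_by('-z')[0] form gives you at the ORM level.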

The version

XYZ.all().extra(select={'z':'MAX(x/y)'})[0].z

does not return the correct x, y, or instance because, without a GROUP BY, MAX is evaluated over all rows. Every instance in the returned QuerySet therefore carries the same value of z, namely MAX(x/y).
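
To illustrate the point about GROUP BY, here is a small sqlite3 sketch (table and rows are hypothetical). It selects only the aggregate, because which row the other columns come from in an ungrouped aggregate query depends on the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyz (id INTEGER PRIMARY KEY, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO xyz (id, x, y) VALUES (?, ?, ?)",
    [(1, 1.0, 4.0), (2, 3.0, 2.0), (3, 5.0, 5.0)],
)

# No GROUP BY: MAX is computed over the whole table, giving a single value
overall = conn.execute("SELECT MAX(x/y) AS z FROM xyz").fetchall()

# GROUP BY id: each group holds one row, so MAX is just that row's x/y
per_row = conn.execute(
    "SELECT id, MAX(x/y) AS z FROM xyz GROUP BY id ORDER BY id"
).fetchall()

print(overall)  # [(1.5,)]
print(per_row)  # [(1, 0.25), (2, 1.5), (3, 1.0)]
```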

okm answered Oct 16 '22 23:10


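Your example that uses F() objects is supported since Django 1.8. Because the argument is a complex expression rather than a plain field name, the aggregate needs an explicit alias (here max_ratio):

XYZ.objects.aggregate(max_ratio=Max(F('x') / F('y')))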

There's a snippet that demonstrates aggregation with Sum() and F() objects in the Django aggregation cheat sheet:

Book.objects.all().aggregate(price_per_page=Sum(F('price')/F('pages')))
mrts answered Oct 16 '22 23:10