
Django ORM group by, and find latest item of each group (window functions)

Say we have a model as below:

from django.db import models


class Cake(models.Model):
    baked_on = models.DateTimeField(auto_now_add=True)
    cake_name = models.CharField(max_length=20)

Now, there are multiple Cakes baked on the same day, and I need a query that will return me a monthly cake report which consists of each day of the month, and the names of the first and last cakes baked on that day.

For example, if the data is something like this:

baked_on        cake_name
11 Jan 12:30    Vanilla
11 Jan 14:30    Strawberry
11 Jan 20:45    Avocado
12 Jan 09:05    Raspberry
12 Jan 16:30    Sprinkles
12 Jan 20:11    Chocolate

My query's output should look like

date    first     last
11 Jan  Vanilla   Avocado
12 Jan  Raspberry Chocolate

How should I go about doing this in a single ORM call?

Anomitra asked Mar 19 '18 15:03



1 Answer

Django 2.0 introduced window functions, which are made for this kind of query. A simple answer to your question is:

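from django.db.models import F, Window
from django.db.models.functions import FirstValue, TruncDate

Cake.objects.annotate(
    # Earliest cake of the day: first value when ordered ascending.
    first_cake=Window(
        expression=FirstValue('cake_name'),
        partition_by=[TruncDate('baked_on')],
        order_by=F('baked_on').asc(),
    ),
    # Latest cake of the day: first value when ordered descending.
    last_cake=Window(
        expression=FirstValue('cake_name'),
        partition_by=[TruncDate('baked_on')],
        order_by=F('baked_on').desc(),
    ),
    day=TruncDate('baked_on'),
).distinct().values_list('day', 'first_cake', 'last_cake')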

Why FirstValue in last_cake? Because by default a window's frame runs from the start of the partition up to the current row and does not look ahead, so for every row LastValue would just return the current row's value. Using FirstValue together with descending ordering works around that. Alternatively, you can define the frame over which the window function should operate:

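from django.db.models import F, ValueRange, Window
from django.db.models.functions import FirstValue, LastValue, TruncDate

Cake.objects.annotate(
    first_cake=Window(
        expression=FirstValue('cake_name'),
        partition_by=[TruncDate('baked_on')],
        order_by=F('baked_on').asc(),
    ),
    last_cake=Window(
        expression=LastValue('cake_name'),
        partition_by=[TruncDate('baked_on')],
        order_by=F('baked_on').asc(),
        # No start or end given: the frame spans the whole partition
        # (UNBOUNDED PRECEDING to UNBOUNDED FOLLOWING).
        frame=ValueRange(),
    ),
    day=TruncDate('baked_on'),
).distinct().values_list('day', 'first_cake', 'last_cake')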
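
To see the frame behaviour outside of Django, here is a minimal sketch in raw SQL run through Python's built-in sqlite3 module (it assumes SQLite 3.25 or later, which added window functions). This is not the SQL that Django generates; the table name cake and the year 2018 in the timestamps are assumptions made for illustration, and the rows are the sample data from the question.

```python
import sqlite3

# Sample rows from the question. The year is an assumption:
# the question only gives day, month and time.
ROWS = [
    ("2018-01-11 12:30:00", "Vanilla"),
    ("2018-01-11 14:30:00", "Strawberry"),
    ("2018-01-11 20:45:00", "Avocado"),
    ("2018-01-12 09:05:00", "Raspberry"),
    ("2018-01-12 16:30:00", "Sprinkles"),
    ("2018-01-12 20:11:00", "Chocolate"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cake (baked_on TEXT, cake_name TEXT)")  # hypothetical table
conn.executemany("INSERT INTO cake VALUES (?, ?)", ROWS)

# Default frame: the window ends at the current row, so LAST_VALUE
# cannot see later cakes baked on the same day.
default_frame = conn.execute("""
    SELECT cake_name,
           LAST_VALUE(cake_name) OVER (
               PARTITION BY date(baked_on) ORDER BY baked_on
           )
    FROM cake
    ORDER BY baked_on
""").fetchall()

# Explicit frame over the whole partition: LAST_VALUE now sees every
# cake of the day, and DISTINCT collapses the rows to one per day.
report = conn.execute("""
    SELECT DISTINCT
           date(baked_on) AS day,
           FIRST_VALUE(cake_name) OVER (
               PARTITION BY date(baked_on) ORDER BY baked_on
               RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS first_cake,
           LAST_VALUE(cake_name) OVER (
               PARTITION BY date(baked_on) ORDER BY baked_on
               RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS last_cake
    FROM cake
    ORDER BY day
""").fetchall()

for name, last_seen in default_frame:
    print(name, last_seen)
for row in report:
    print(row)
```

With the default frame, every row reports its own cake_name as the last value, which is why the first snippet falls back to FirstValue with descending order. With the explicit frame, the report has one row per day with the first and last cake.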
GwynBleidD answered Nov 15 '22 13:11
