Is it possible to calculate the cumulative (running) sum using django's orm? Consider the following model:
class AModel(models.Model):
a_number = models.IntegerField()
with a set of data where a_number = 1
. Such that I have a number ( >1 ) of AModel
instances in the database all with a_number=1
. I'd like to be able to return the following:
AModel.objects.annotate(cumsum=??).values('id', 'cumsum').order_by('id')
>>> ({id: 1, cumsum: 1}, {id: 2, cumsum: 2}, ... {id: N, cumsum: N})
Ideally I'd like to be able to limit/filter the cumulative sum. So in the above case I'd like to limit the result to cumsum <= 2
I believe that in postgresql one can achieve a cumulative sum using window functions. How is this translated to the ORM?
Cumulative sums, or running totals, are used to display the total sum of data as it grows with time (or any other series or progression). This lets you view the total contribution so far of a given measure against time.
Appending the annotate() clause onto a QuerySet lets you add an attribute to each item in the QuerySet, like if you wanted to count the amount of articles in each category. However, sometimes you only want to count objects that match a certain condition, for example only counting articles that are published.
For reference, starting with Django 2.0 it is possible to use the Window
function to achieve this result:
AModel.objects.annotate(cumsum=Window(Sum('a_number'), order_by=F('id').asc()))\
.values('id', 'cumsum').order_by('id', 'cumsum')
From Dima Kudosh's answer and based on https://stackoverflow.com/a/5700744/2240489 I had to do the following:
I removed the reference to PARTITION BY
in the sql and replaced with ORDER BY
resulting in.
AModel.objects.annotate(
cumsum=Func(
Sum('a_number'),
template='%(expressions)s OVER (ORDER BY %(order_by)s)',
order_by="id"
)
).values('id', 'cumsum').order_by('id', 'cumsum')
This gives the following sql:
SELECT "amodel"."id",
SUM("amodel"."a_number")
OVER (ORDER BY id) AS "cumsum"
FROM "amodel"
GROUP BY "amodel"."id"
ORDER BY "amodel"."id" ASC, "cumsum" ASC
Dima Kudosh's answer was not summing the results but the above does.
For posterity, I found this to be a good solution for me. I didn't need the result to be a QuerySet, so I could afford to do this, since I was just going to plot the data using D3.js:
import numpy as np
import datettime
today = datetime.datetime.date()
raw_data = MyModel.objects.filter('date'=today).values_list('a_number', flat=True)
cumsum = np.cumsum(raw_data)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With