The Development version of Django has aggregate functions like Avg, Count, Max, Min, StdDev, Sum, and Variance (link text). Is there a reason Median is missing from the list?
Implementing one seems like it would be easy. Am I missing something? How much are the aggregate functions doing behind the scenes?
Here's your missing function. Pass it a queryset and the name of the column that you want to find the median for:
def median_value(queryset, term):
count = queryset.count()
return queryset.values_list(term, flat=True).order_by(term)[int(round(count/2))]
That wasn't as hard as some of the other responses seem to indicate. The important thing is to let the db sorting do all of the work, so if you have the column already indexed, this is a super cheap operation.
(update 1/28/2016) If you want to be more strict about the definition of median for an even number of items, this will average together the value of the two middle values.
def median_value(queryset, term):
count = queryset.count()
values = queryset.values_list(term, flat=True).order_by(term)
if count % 2 == 1:
return values[int(round(count/2))]
else:
return sum(values[count/2-1:count/2+1])/Decimal(2.0)
Because median isn't a SQL aggregate. See, for example, the list of PostgreSQL aggregate functions and the list of MySQL aggregate functions.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With