Missing 'Median' Aggregate Function in Django?

The development version of Django has aggregate functions like Avg, Count, Max, Min, StdDev, Sum, and Variance. Is there a reason Median is missing from the list?

Implementing one seems like it would be easy. Am I missing something? How much are the aggregate functions doing behind the scenes?

sutee asked Jun 03 '09 00:06

2 Answers

Here's your missing function. Pass it a queryset and the name of the column that you want to find the median for:

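def median_value(queryset, term):
    count = queryset.count()
    # Integer division keeps the index an int on Python 3; for an even count
    # this returns the upper of the two middle values.
    return queryset.values_list(term, flat=True).order_by(term)[count // 2]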

That wasn't as hard as some of the other responses seem to indicate. The important thing is to let the database do all of the sorting and return only the middle row, so if the column is already indexed, this is a cheap operation.
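To make the "let the database sort" point concrete, here is a minimal sketch using Python's built-in sqlite3 module instead of Django. Indexing an ordered queryset at a single position is sent to the database as a query with LIMIT 1 and an OFFSET, which is roughly what the raw SQL below does. The table name `scores`, the column `value`, the index name, and the helper `median_row` are hypothetical names chosen for illustration, not anything from Django itself.

```python
import sqlite3

def median_row(conn, table, column):
    """Return the middle value of `column`, letting the database sort.

    `table` and `column` are interpolated into the SQL text, so they must
    be trusted identifiers (hypothetical helper, for illustration only).
    """
    (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    if count == 0:
        return None  # no rows, so no median
    row = conn.execute(
        f"SELECT {column} FROM {table} ORDER BY {column} LIMIT 1 OFFSET ?",
        (count // 2,),
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (value INTEGER)")
conn.execute("CREATE INDEX scores_value_idx ON scores (value)")
conn.executemany(
    "INSERT INTO scores (value) VALUES (?)",
    [(v,) for v in (7, 1, 9, 3, 5)],
)
middle = median_row(conn, "scores", "value")
print(middle)
```

Only one row travels back from the database; the sorting and skipping happen on the database side, which is where an index on the column can help.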

(update 1/28/2016) If you want to be stricter about the definition of the median for an even number of items, this version averages the two middle values.

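from decimal import Decimal

def median_value(queryset, term):
    count = queryset.count()
    values = queryset.values_list(term, flat=True).order_by(term)
    if count % 2 == 1:
        # Odd count: return the single middle value.
        return values[count // 2]
    else:
        # Even count: average the two middle values. Slice bounds must be
        # integers, hence //. Dividing by Decimal assumes an integer or
        # Decimal column; divide by 2.0 instead for a float column.
        return sum(values[count // 2 - 1:count // 2 + 1]) / Decimal(2)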
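As a quick sanity check of the odd/even index arithmetic, the same logic can be run on a plain sorted list and compared with the standard library's `statistics.median`. This is only a sketch: `median_of_sorted` is a hypothetical helper, a Python list stands in for the ordered queryset, and plain division by 2 replaces the Decimal divisor.

```python
import statistics

def median_of_sorted(values):
    """Median of an already-sorted, non-empty list (hypothetical helper)."""
    count = len(values)
    if count % 2 == 1:
        # Odd count: the single middle element.
        return values[count // 2]
    # Even count: average the two elements around the midpoint.
    lower, upper = values[count // 2 - 1:count // 2 + 1]
    return (lower + upper) / 2

odd = sorted([7, 1, 9, 3, 5])
even = sorted([7, 1, 9, 3])
assert median_of_sorted(odd) == statistics.median(odd)
assert median_of_sorted(even) == statistics.median(even)
```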
Mark Chackerian answered Sep 30 '22 20:09

Because median isn't a SQL aggregate. See, for example, the list of PostgreSQL aggregate functions and the list of MySQL aggregate functions.

jacobian answered Sep 30 '22 18:09
