
How to find top-X highest values in column using Django Queryset without cutting off ties at the bottom?

I have the following Django Model:

from django.db import models

class myModel(models.Model):
    name = models.CharField(max_length=255, unique=True)
    score = models.FloatField()

There are thousands of rows in the DB for this model. I would like to efficiently and elegantly use QuerySets alone to get the ten highest scores and display the names with their scores in descending order of score. So far, that is relatively easy.

Here is the wrinkle: if multiple myModels are tied for tenth place, I want to show all of them, not just some. Showing only some would give certain names an arbitrary advantage over others. If absolutely necessary, I can do some post-DB list processing outside of QuerySets. However, the main problem I see is that I cannot know a priori that limiting my DB query to the top 10 elements is enough, since for all I know there may be a million records tied for tenth place.

Do I need to get all the myModels sorted by score, do one pass over them to calculate the score threshold, and then use that threshold as a filter in another QuerySet?

If I wanted to write this in plain SQL, could I even do it in a single query?

Saqib Ali asked Jan 14 '14 05:01




1 Answer

Yes, you can do it in one SQL query, and generating that query with the Django ORM is straightforward.

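# Distinct score values, highest first (lazy; nothing is evaluated yet).
top_scores = (myModel.objects
                     .order_by('-score')
                     .values_list('score', flat=True)
                     .distinct())
# All records whose score is among the ten highest distinct values;
# the sliced queryset is passed to the database as a subquery.
top_records = (myModel.objects
                      .order_by('-score')
                      .filter(score__in=top_scores[:10]))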

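This should generate a single SQL query (with a subquery). Note that MySQL has historically rejected LIMIT inside an IN subquery, so on that backend the sliced scores may need to be evaluated into a list first.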
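Note that the snippet above selects by the ten highest distinct score values, which can return more rows than "the top ten rows plus anything tied for tenth place" when there are ties higher up. A minimal sketch of the stricter threshold variant in plain SQL follows, using Python's built-in sqlite3 module so it can run without a Django project. The table name `scores`, the helper names `top_n_with_ties` and `build_demo_db`, and the sample rows are all assumptions for illustration; a real Django table would be named after the app and model.

```python
import sqlite3


def top_n_with_ties(conn, n=10):
    """Return (name, score) rows whose score is >= the n-th highest score.

    Assumes a hypothetical table `scores(name, score)`; this is a sketch,
    not the table Django would generate for myModel.
    """
    query = """
        SELECT name, score
        FROM scores
        WHERE score >= (
            -- Threshold: the lowest score among the top n rows.
            SELECT MIN(score) FROM (
                SELECT score FROM scores ORDER BY score DESC LIMIT ?
            )
        )
        ORDER BY score DESC, name
    """
    return conn.execute(query, (n,)).fetchall()


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scores (name TEXT UNIQUE, score REAL)")
    rows = [("a", 9.0), ("b", 8.0), ("c", 7.0), ("d", 7.0), ("e", 5.0)]
    conn.executemany("INSERT INTO scores VALUES (?, ?)", rows)
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    # "c" and "d" are tied for third place, so asking for 3 yields 4 rows.
    for name, score in top_n_with_ties(demo, 3):
        print(name, score)
```

With the ORM, a comparable threshold could presumably be read with `myModel.objects.order_by('-score').values_list('score', flat=True)[9]` and then applied via `filter(score__gte=threshold)`, at the cost of a second query and an IndexError to handle when fewer than ten rows exist.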

Krzysztof Szularz answered Oct 05 '22 00:10