Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Order queryset by alternating value

I have the following model:

class Entry(models.Model):
    name = models.Charfield(max_length=255)
    client = models.Charfield(max_length=255)

client is the name of the client as could have values like facebook, google, and so on.

Is it possible to order the queryset so that the result is alternating the values of client? What I expect is something like this:

Entry.objects.order_by('alternate client') --> 

| client   | name   |
| google   | robert |
| facebook | linda  |
| google   | kate   | 
| facebook | jack   |
| google   | nina   |
| facebook | pierre |    

I am using django2.x and postgres if that helps.

EDIT:

Some additional info / requirements.

  • I have about 10 to 20 differtent clients
  • Entry does also have a created DateField. If possible result should also be ordered by date
  • I want to use pagination for Entry so solution should be using django's ORM
like image 910
ilse2005 Avatar asked May 01 '18 16:05

ilse2005


People also ask

What does QuerySet []> mean?

A QuerySet is a collection of data from a database. A QuerySet is built up as a list of objects. QuerySets makes it easier to get the data you actually need, by allowing you to filter and order the data.

What is a QuerySet?

A QuerySet represents a collection of objects from your database. It can have zero, one or many filters. Filters narrow down the query results based on the given parameters. In SQL terms, a QuerySet equates to a SELECT statement, and a filter is a limiting clause such as WHERE or LIMIT .

What does a QuerySet return?

Returns a QuerySet that returns dictionaries, rather than model instances, when used as an iterable. Each of those dictionaries represents an object, with the keys corresponding to the attribute names of model objects.


1 Answers

Since you use Postgres you can use its Window Functions which perform a calculation across a set of table rows that are somehow related to the current row. Another good information relies in the fact that you use Django2.x which supports Window Functions(Django docs) which allows adding an OVER clause to Querysets.

Your use-case can be resolved with Single ORM query like:

from django.db.models.expressions import Window
from django.db.models.functions import RowNumber
from django.db.models import F

results = Entry.objects.annotate(row_number=Window(
    expression=RowNumber(),
    partition_by=[F('client')],
    order_by=F('created').desc())
).order_by('row_number', 'client')

for result in results:
    print('Id: {} - client: {} - row_number {}'.format(result.id, result.client, result.row_number))

Output:

Id: 12 - client: facebook - row_number 1
Id: 13 - client: google - row_number 1
Id: 11 - client: facebook - row_number 2
Id: 8 - client: google - row_number 2
Id: 10 - client: facebook - row_number 3
Id: 5 - client: google - row_number 3
Id: 9 - client: facebook - row_number 4
Id: 3 - client: google - row_number 4
Id: 7 - client: facebook - row_number 5
Id: 2 - client: google - row_number 5
Id: 6 - client: facebook - row_number 6
Id: 1 - client: google - row_number 6
Id: 4 - client: facebook - row_number 7

The raw SQL looks like:

SELECT 
"orm_entry"."id",
"orm_entry"."name",
"orm_entry"."client",
"orm_entry"."created",
ROW_NUMBER() OVER (PARTITION BY "orm_entry"."client" ORDER BY "orm_entry"."created" DESC) AS "row_number" 
FROM "orm_entry" 
ORDER BY "row_number" ASC, "orm_entry"."client" ASC

Window functions are declared just as an aggregate function followed by an OVER clause, which indicates exactly how rows are being grouped. The group of rows onto which the window function is applied is called "partition".

You can notice that we grouped the rows by 'client' field and you can conclude that in our example we will have the two partitions. First partition will contain all the 'facebook' entries and second partition will contain all the 'google' entries. In its basic form, a partition is no different than a normal aggregate function group: simply a set of rows considered "equal" by some criteria, and the function will be applied over all these rows to return a single result.

In your example we can use the row_number window function which simply returns the index of the current row within its partition starting from 1. That helped me to establish the alternating output in order_by('row_number', 'client').

Additional information:

If you want to achieve an order like this:

'facebook','facebook', 'google','google','facebook','facebook','google','google'

or

'facebook','facebook','facebook','google','google','google','facebook', 'facebook','facebook'

You will need to do one small math related modification of the previous query like:

GROUP_SIZE = 2
results = Entry.objects.annotate(row_number=Window(
    expression=RowNumber(),
    partition_by=[F('client')],
    order_by=F('created').desc())
).annotate(row_number=(F('row_number') - 1)/GROUP_SIZE + 1).order_by('row_number', 'client')

for result in results:
    print('Id: {} - client: {} - row_number {}'.format(result.id, result.client, result.row_number))

Output:

Id: 12 - client: facebook - row_number 1
Id: 11 - client: facebook - row_number 1
Id: 8 - client: google - row_number 1
Id: 13 - client: google - row_number 1
Id: 10 - client: facebook - row_number 2
Id: 9 - client: facebook - row_number 2
Id: 3 - client: google - row_number 2
Id: 5 - client: google - row_number 2
Id: 7 - client: facebook - row_number 3
Id: 6 - client: facebook - row_number 3
Id: 1 - client: google - row_number 3
Id: 2 - client: google - row_number 3
Id: 4 - client: facebook - row_number 4

You can notice that GROUP_SIZE constant defines how many items will be in the each alternating group.

P.S.

Thank you for asking this question because it helped me to better understand the Window Functions.

Happy coding :)

like image 118
T.Tokic Avatar answered Oct 09 '22 15:10

T.Tokic