Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pivoting data and complex annotations in Django ORM

The ORM in Django lets us easily annotate (add fields to) querysets based on related data, hwoever I can't find a way to get multiple annotations for different filtered subsets of related data.

This is being asked in relation to django-helpdesk, an open-source Django-powered trouble-ticket tracker. I need to have data pivoted like this for charting and reporting purposes

Consider these models:

CHOICE_LIST = (
    ('open', 'Open'),
    ('closed', 'Closed'),
)

class Queue(models.model):
    name = models.CharField(max_length=40)

class Issue(models.Model):
    subject = models.CharField(max_length=40)
    queue = models.ForeignKey(Queue)
    status = models.CharField(max_length=10, choices=CHOICE_LIST)

And this dataset:

Queues:

ID | Name
---+------------------------------
1  | Product Information Requests
2  | Service Requests

Issues:

ID | Queue | Status
---+-------+---------
1  | 1     | open
2  | 1     | open
3  | 1     | closed
4  | 2     | open
5  | 2     | closed
6  | 2     | closed
7  | 2     | closed

I would like to see an annotation/aggregate look something like this:

Queue ID | Name                          | open | closed
---------+-------------------------------+------+--------
1        | Product Information Requests  | 2    | 1
2        | Service Requests              | 1    | 3

This is basically a crosstab or pivot table, in Excel parlance. I am currently building this output using some custom SQL queries, however if I can move to using the Django ORM I can more easily filter the data dynamically without doing dodgy insertion of WHERE clauses in my SQL.

For "bonus points": How would one do this where the pivot field (status in the example above) was a date, and we wanted the columns to be months / weeks / quarters / days?

like image 721
rossp Avatar asked May 11 '11 01:05

rossp


People also ask

What is annotate in Django ORM?

In the Django framework, both annotate and aggregate are responsible for identifying a given value set summary. Among these, annotate identifies the summary from each of the items in the queryset. Whereas in the case of aggregate, the summary is calculated for the entire queryset.

What is difference between annotate and aggregate in Django?

Unlike aggregate() , annotate() is not a terminal clause. The output of the annotate() clause is a QuerySet ; this QuerySet can be modified using any other QuerySet operation, including filter() , order_by() , or even additional calls to annotate() .

What does Django use for ORM?

The Django web framework includes a default object-relational mapping layer (ORM) that can be used to interact with application data from various relational databases such as SQLite, PostgreSQL and MySQL. The Django ORM is an implementation of the object-relational mapping (ORM) concept.

What is QuerySet annotate?

In simpler terms, annotate allows us to add a pseudo field to our queryset. This field can then further be used for filter lookups or ordering.\ Ex: Suppose you want to all the Books along with the total number of chapters present in each one, then you could use annotate as follows: Simple annotate example.


2 Answers

You have Python, use it.

from collections import defaultdict
summary = defaultdict( int )
for issue in Issues.objects.all():
    summary[issue.queue, issue.status] += 1

Now your summary object has queue, status as a two-tuple key. You can display it directly, using various template techniques.

Or, you can regroup it into a table-like structure, if that's simpler.

table = []
queues = list( q for q,_ in summary.keys() )
for q in sorted( queues ):
    table.append( q.id, q.name, summary.count(q,'open'), summary.count(q.'closed') )

You have lots and lots of Python techniques for doing pivot tables.

If you measure, you may find that a mostly-Python solution like this is actually faster than a pure SQL solution. Why? Mappings can be faster than SQL algorithms which require a sort as part of a GROUP-BY.

like image 90
S.Lott Avatar answered Oct 14 '22 05:10

S.Lott


Django has added a lot of functionality to the ORM since this question was originally asked. The answer to how to pivot data since Django 1.8 is to use the Case/When conditional expressions. And there is a third party app that will do that for you, PyPI and documentation

like image 30
Brad Martsberger Avatar answered Oct 14 '22 06:10

Brad Martsberger