Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Annotate a queryset with the average date difference? (django)

I searched all over place for an answer to this but couldn't find anything. Perhaps this is just a stupid question or a really tricky one. Here it is:

Let's say my model is this (pseudo django code):

Event
  type = ForeignKey(EventType)
  name = CharField
  date_start = DateField
  date_end = DateField

EventType
  name = CharField

What I want to know is the average duration time for each event type. What I do now is calculate the average duration whenever a new event is created (save method) and have that stored in an average_duration column in EventType. The problem with this approach is that I cannot answer questions like "what was the average duration time for events of type X, during the year Y". So instead of adding more columns to answer questions like these I would prefer to have it done in "real-time".

Can this be done by annotating the queryset? First I would have to get the date differences for each event type, then come up with their average, and then annotate the Event queryset with that average, I assume.

like image 525
Nicolas R Avatar asked Jun 28 '10 08:06

Nicolas R


People also ask

What is annotate in Django QuerySet?

Appending the annotate() clause onto a QuerySet lets you add an attribute to each item in the QuerySet, like if you wanted to count the amount of articles in each category. However, sometimes you only want to count objects that match a certain condition, for example only counting articles that are published.

What is difference between annotate and aggregate Django?

In the Django framework, both annotate and aggregate are responsible for identifying a given value set summary. Among these, annotate identifies the summary from each of the items in the queryset. Whereas in the case of aggregate, the summary is calculated for the entire queryset.

Is Django QuerySet lazy?

This is because a Django QuerySet is a lazy object. It contains all of the information it needs to populate itself from the database, but will not actually do so until the information is needed.

What does QuerySet []> mean?

A QuerySet is a collection of data from a database. A QuerySet is built up as a list of objects. QuerySets makes it easier to get the data you actually need, by allowing you to filter and order the data.


2 Answers

Just an update. In Django >= 1.8 it is possible to do:

from django.db.models import F, ExpressionWrapper, fields

duration = ExpressionWrapper(F('date_end') - F('date_start'), output_field=fields.DurationField())

events_with_duration = Event.objects.annotate(duration=duration)

after which you can run queries like:

events_with_duration.filter(duration__gt=timedelta(days=10))
like image 89
ryuusenshi Avatar answered Sep 20 '22 05:09

ryuusenshi


You'll need to create a queryset with the extra method to add the date difference to each row

Then use the aggregate method to compute the average for your just added column:

Be careful though, this method is slow and won't scale. Storing the computed value on event_type is imho your best option.

like image 34
Guillaume Esquevin Avatar answered Sep 18 '22 05:09

Guillaume Esquevin