Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Aggregate difference between DateTime fields in Django

I have a table containing a series of entries which relate to time periods (specifically, time worked for a client):

task_time:
id     |    start_time    |    end_time       |    client (fk)
1        08/12/2011 14:48   08/12/2011 14:50     2

I am trying to aggregate all the time worked for a given client, from my Django app:

time_worked_aggregate = models.TaskTime.objects.\
                        filter(client = some_client_id).\
                        extra(select = {'elapsed': 'SUM(task_time.end_time - task_time.start_time)'}).\
                        values('elapsed')

if len(time_worked_aggregate) > 0:
    time_worked = time_worked_aggregate[0]['elapsed'].total_seconds()
else:
    time_worked = 0

This seems inelegant, but it does work. Or at least so I thought: it turns out that it works fine on a PostgreSQL database, but when I move over to SQLite, everything dies.

A bit of digging suggests that the reason for this is that DateTimes aren't first-class data in SQLite. The following raw SQLite query will do my job:

SELECT SUM(strftime('%s', end_time) - strftime('%s', start_time)) FROM task_time WHERE ...;

My question is as follows:

  • The Python sample above seems roundabout. Can we do this more elegantly?
  • More importantly at this stage, can we do it in a way that will work on both Postgres and SQLite? Ideally, I'd like not to be writing raw SQL queries and switching on the database backend that happens to be in place; in general, Django is extremely good at protecting us from this. Does Django have a reasonable abstraction for this operation? If not, what's a sensible way for me to do a conditional switch on the backend?

I should mention for context that the dataset is many thousands of entries; the following is not really practical:

sum([task_time.end_date - task_time.start_date for task_time in models.TaskTime.objects.filter(...)])
like image 409
jelford Avatar asked Dec 08 '11 14:12

jelford


2 Answers

Almost the same solution as @andri proposed. In the final result you will get the same data. ExpressionWrapper - New in Django 1.8.

from datetime import timedelta
from django.db.models import ExpressionWrapper, F, fields
from app.models import MyModel

duration = ExpressionWrapper(F('closed_at') - F('opened_at'), output_field=fields.DurationField())
objects = MyModel.objects.closed().annotate(duration=duration).filter(duration__gt=timedelta(seconds=2))

for obj in objects:
    print obj.id, obj.duration, obj.duration.seconds

# sample output
# 807 0:00:57.114017 57
# 800 0:01:23.879478 83
# 804 3:40:06.797188 13206
# 801 0:02:06.786300 126
like image 89
michal-michalak Avatar answered Nov 12 '22 01:11

michal-michalak


I think since Django 1.8 we can do better:

I would like just to draw the part with annotation, the further part with aggregation should be straightforward:

from django.db.models import F, Func
SomeModel.objects.annotate(
    duration = Func(F('end_date'), F('start_date'), function='age')
)

[more about postgres age function here: http://www.postgresql.org/docs/8.4/static/functions-datetime.html ]

each instance of SomeModel will be anotated with duration field containg time difference, which in python will be a datetime.timedelta() object [more about datetime timedelta here: https://docs.python.org/2/library/datetime.html#timedelta-objects ]

like image 21
andilabs Avatar answered Nov 12 '22 01:11

andilabs