I have a basic model like:
class Stats(models.Model):
created = models.DateTimeField(auto_now_add=True)
growth = models.IntegerField()
I run a celery job every 10 minutes to create a new stats object.
Using .latest()
on a QuerySet
gives me the latest Stats object to date.
However, I want a list with one Stats object for each day.
Consider the following:
Stats(growth=100) #created 1/1/13 23:50
Stats(growth=200) #created 1/1/13 23:59
Stats(growth=111) #created 1/2/13 23:50
Stats(growth=222) #created 1/2/13 23:59
The QuerySet
should return the latest for each day. In the example the one with 200 and 222 growth.
In SQL I'd fire up a subquery with the max for each day and join it together.
Since I don't want to use raw SQL, is there a way to do this with the django ORM?
last() to retrieve last and Model. objects. all().
A QuerySet is a collection of data from a database. A QuerySet is built up as a list of objects. QuerySets makes it easier to get the data you actually need, by allowing you to filter and order the data.
first, which takes a query set and returns the first element, or None if the query set was empty. Note, that this is not identical. The get method will ensure that there is exactly one row in the database that matches the query.
Unfortunately there's no way (that I'm aware of.. I looked pretty hard) to avoid using some kind of raw sql to accomplish what you want to do (with your current model; see the very end for another suggestion). But you can minimise the impact by writing as little raw sql as possible. In practise django sites do not need to be portable across different databases. Unless you plan on using this app elsewhere or releasing it publicly, you should be ok.
The below example is for sqlite. You could keep a mapping of database types to date
functions, look up the type of driver, and replace the function with the correct one if you needed to.
>>> for stat in Stats.objects.all():
... print stat.created, stat.growth
...
2013-06-22 13:41:25.334262+00:00 3
2013-06-22 13:41:40.473373+00:00 3
2013-06-22 13:41:44.921247+00:00 4
2013-06-22 13:41:47.533102+00:00 5
2013-06-23 13:41:58.458250+00:00 6
2013-06-23 13:42:01.282702+00:00 3
2013-06-23 13:42:03.633236+00:00 1
>>> last_stat_per_day = Stats.objects.extra(
select={'the_date': 'date(created)' }
).values_list('the_date').annotate(max_date=Max('created'))
>>> last_stat_per_day
[(u'2013-06-22', datetime.datetime(2013, 6, 22, 13, 41, 47, 533102, tzinfo=<UTC>)), (u'2013-06-23', datetime.datetime(2013, 6, 23, 13, 42, 3, 633236, tzinfo=<UTC>))]
>>> max_dates = [item[1] for item in last_stat_per_day]
>>> max_dates
[datetime.datetime(2013, 6, 22, 13, 41, 47, 533102, tzinfo=<UTC>),
datetime.datetime(2013, 6, 23, 13, 42, 3, 633236, tzinfo=<UTC>)]
>>> stats = Stats.objects.filter(created__in=max_dates)
>>> for stat in stats:
... print stat.created, stat.growth
...
2013-06-22 13:41:47.533102+00:00 5
2013-06-23 13:42:03.633236+00:00 1
I had written here before that this was only a single query but I lied - the values_list needs to be transformed to only return the max_date for the successive query, which means running the statement. It's only 2 queries though, which would be significantly better than an N+1 function.
The non-portable bit is this:
last_stat_per_day = Stats.objects.extra(
select={'the_date': 'date(created)' }
).values_list('the_date').annotate(max_date=Max('created'))
Using extra
isn't ideal, but the raw sql here is simple, and lends itself nicely to a database driver dependent replacement. Only the date(created)
needs to be replaced. You can wrap this up in a method on a custom manager if you like, and you've then successfully abstracted this mess away in a single location.
The other option is to just add a DateField
to your model, and then you don't need to use extra at all. You'd simply replace the values_list
call with a values_list('created_date')
, remove the extra
completely, and call it a day. The cost is obvious - more storage space required. It's also non-intuitive as to why you have a Date
and a DateTime
field on the same model. Keeping the two in sync may also pose problems.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With