
Django: Distinct foreign keys

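from django.db import models

class Log(models.Model):
    # CASCADE was the implicit default before Django 2.0; newer versions require on_delete
    project = models.ForeignKey(Project, on_delete=models.CASCADE)
    msg = models.CharField(...)
    date = models.DateField(...)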

I want to select the four most recent Log entries, where each entry must have a unique project foreign key. I've tried the solutions I found through Google, but none of them work, and the Django documentation isn't very good for looking this up.

I tried stuff like:

Log.objects.all().distinct('project')[:4]
Log.objects.values('project').distinct()[:4]
Log.objects.values_list('project').distinct('project')[:4]

But these either return nothing or return Log entries that all belong to the same project.

Any help would be appreciated!

mrmclovin asked Feb 08 '11 16:02



2 Answers

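Queries don't work like that - either in Django's ORM or in the underlying SQL. If you want to get unique IDs, you can only query for the ID. So you'll need a second step to get the actual Log entries. Something like:

from django.db.models import Max
# Group by project, newest first, then fetch each project's latest Log entry (one small query per project).
latest = Log.objects.values('project_id').annotate(max_date=Max('date')).order_by('-max_date')[:4]
entries = [Log.objects.filter(project_id=row['project_id']).latest('date') for row in latest]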
Daniel Roseman answered Nov 20 '22 14:11



Actually, you can get the project_ids in SQL. Assuming that you want the unique project ids for the four projects with the latest log entries, the SQL would look like this:

SELECT project_id, max(logs.date) as max_date
FROM logs
GROUP BY project_id
ORDER BY max_date DESC LIMIT 4;

Now, you actually want all of the log information. In PostgreSQL 8.4 and later you can use windowing functions, but that doesn't work on older versions or on databases without window-function support, so I'll do it the more complex way first:

SELECT logs.*
FROM logs JOIN (
    SELECT project_id, max(logs.date) as max_date
    FROM logs
    GROUP BY project_id
    ORDER BY max_date DESC LIMIT 4 ) as latest
ON logs.project_id = latest.project_id
   AND logs.date = latest.max_date;
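
The join can be tried without a PostgreSQL server. Here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The table and column names follow the SQL above; the sample rows, the msg column, and the function name latest_per_project are made up for illustration, and the outer ORDER BY is an addition so the rows come back newest first.

```python
import sqlite3

# Hypothetical sample data: (project_id, msg, date as an ISO string).
SAMPLE_LOGS = [
    (1, "p1 old", "2011-01-01"),
    (1, "p1 new", "2011-02-05"),
    (2, "p2 only", "2011-02-07"),
    (3, "p3 old", "2011-01-10"),
    (3, "p3 new", "2011-01-20"),
    (4, "p4 only", "2011-02-01"),
    (5, "p5 only", "2010-12-31"),
]

# Same shape as the join above: the subquery picks the projects with the
# newest entries, and the join pulls the full log row for each of them.
JOIN_QUERY = """
SELECT logs.project_id, logs.msg, logs.date
FROM logs JOIN (
    SELECT project_id, max(logs.date) AS max_date
    FROM logs
    GROUP BY project_id
    ORDER BY max_date DESC LIMIT ? ) AS latest
ON logs.project_id = latest.project_id
   AND logs.date = latest.max_date
ORDER BY logs.date DESC
"""


def latest_per_project(rows, limit=4):
    """Return the newest log row of the `limit` most recently logged projects."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE logs (project_id INTEGER, msg TEXT, date TEXT)")
        conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", rows)
        return conn.execute(JOIN_QUERY, (limit,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_per_project(SAMPLE_LOGS):
        print(row)
```

With the sample rows, project 5 is dropped because its only entry is the oldest. Note that if two entries of the same project share the latest date, the join returns both, so the result can hold more than four rows.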

Now, if you have access to windowing functions, it's a bit neater (I think anyway), and certainly faster to execute:

SELECT * FROM (
   SELECT logs.field1, logs.field2, logs.field3, logs.date,
       rank() over ( partition by project_id
                     order by "date" DESC ) as dateorder
   FROM logs ) as logsort
WHERE dateorder = 1
ORDER BY logsort."date" DESC LIMIT 4;

OK, maybe it's not easier to understand, but take my word for it, it runs worlds faster on a large database.
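
For anyone who wants to try the windowing approach locally, here is a rough sketch using Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer, since that is when SQLite gained window functions. The sample rows, the msg column, and the function name newest_entries are invented for illustration; the rank() / partition structure mirrors the query above.

```python
import sqlite3

# Hypothetical sample data: (project_id, msg, date as an ISO string).
SAMPLE_LOGS = [
    (1, "p1 old", "2011-01-01"),
    (1, "p1 new", "2011-02-05"),
    (2, "p2 only", "2011-02-07"),
    (3, "p3 old", "2011-01-10"),
    (3, "p3 new", "2011-01-20"),
    (4, "p4 only", "2011-02-01"),
    (5, "p5 only", "2010-12-31"),
]

# rank() numbers each project's entries from newest to oldest; keeping
# dateorder = 1 leaves only the newest entry (or entries, on a tie).
WINDOW_QUERY = """
SELECT logsort.project_id, logsort.msg, logsort.date FROM (
    SELECT logs.project_id, logs.msg, logs.date,
           rank() OVER ( PARTITION BY logs.project_id
                         ORDER BY logs.date DESC ) AS dateorder
    FROM logs ) AS logsort
WHERE dateorder = 1
ORDER BY logsort.date DESC LIMIT ?
"""


def newest_entries(rows, limit=4):
    """Return up to `limit` rows, each the newest entry of its project."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE logs (project_id INTEGER, msg TEXT, date TEXT)")
        conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", rows)
        return conn.execute(WINDOW_QUERY, (limit,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in newest_entries(SAMPLE_LOGS):
        print(row)
```

Because rank() gives tied rows the same rank, two entries of one project with the same latest date would both pass the dateorder = 1 filter and both count toward the LIMIT. Swapping in row_number() would keep exactly one row per project, though that is a change from the query as written.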

I'm not entirely sure how that translates to object syntax, though, or even if it does. Also, if you wanted to get other project data, you'd need to join against the projects table.

Josh Berkus answered Nov 20 '22 14:11
