Django query select distinct by field pairs

I have a model, Submission, which has a user and a problem. How can I write a query that returns only one result per user-problem pair?

Models are like this:

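from django.contrib.auth.models import User
from django.db import models

class Problem(models.Model):
    title = models.CharField('Title', max_length=100)
    question = models.TextField('Question')

class Submission(models.Model):
    # on_delete is required since Django 2.0.
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    problem = models.ForeignKey(Problem, on_delete=models.CASCADE)
    # The original post read models.CharKey(), which is not a Django field;
    # a TextField is assumed here.
    solution = models.TextField()
    time = models.DateTimeField('Time', auto_now_add=True)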
crodjer asked Aug 14 '10 12:08



2 Answers

Try this:

distinct_users_problems = Submission.objects.all().values("user", "problem").distinct()

It will give you a queryset of dicts like this one:

[{'problem': 1, 'user': 1}, {'problem': 2, 'user': 1}, {'problem': 3, 'user': 1}]

containing all the distinct pairs.

It actually results in your usual SELECT DISTINCT SQL query.
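
To see what that SELECT DISTINCT does to repeated pairs, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table name `submission` and the columns `user_id` and `problem_id` are assumptions for illustration; the real table name depends on your app label.

```python
import sqlite3

# Hypothetical stand-in for the table behind the Submission model.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE submission ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, problem_id INTEGER)"
)
conn.executemany(
    "INSERT INTO submission (user_id, problem_id) VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 2), (2, 1), (1, 2)],  # two pairs are repeated
)

# Roughly the query that values("user", "problem").distinct() asks for.
distinct_pairs = conn.execute(
    "SELECT DISTINCT user_id, problem_id FROM submission "
    "ORDER BY user_id, problem_id"
).fetchall()
print(distinct_pairs)
```

Each user-problem pair appears once, but only the pair itself is returned, not a particular submission row.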

Tomek Kopczuk answered Oct 17 '22 07:10



Update 2:

(After reading OP's comments) I suggest adding a new model to track the latest submission. Call it LatestSubmission.

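class LatestSubmission(models.Model):
    # on_delete is required since Django 2.0.
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    problem = models.ForeignKey(Problem, on_delete=models.CASCADE)
    submission = models.ForeignKey(Submission, on_delete=models.CASCADE)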

You can then either

  1. override Submission.save() to create/update the entry in LatestSubmission every time a user posts a new solution for a Problem
  2. attach a function that does the same to a suitable signal.

so that LatestSubmission contains one row per user-problem pair, each pointing to that user's latest submission for that problem. Once you have this in place you can fire a single query:

LatestSubmission.objects.all().order_by('problem')
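
The create-or-update step that save() or the signal handler would perform can be sketched without Django, using the built-in sqlite3 module. This is only an illustration of the idea: the table `latest_submission`, its columns, and the helper `record_submission` are hypothetical names. In Django itself the same step would more likely be written with the manager's update_or_create() method.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the UNIQUE constraint allows one row per user-problem pair.
conn.execute(
    "CREATE TABLE latest_submission ("
    "user_id INTEGER, problem_id INTEGER, submission_id INTEGER, "
    "UNIQUE (user_id, problem_id))"
)

def record_submission(user_id, problem_id, submission_id):
    """Create the pointer for this pair, or overwrite it if one exists."""
    conn.execute(
        "INSERT OR REPLACE INTO latest_submission "
        "(user_id, problem_id, submission_id) VALUES (?, ?, ?)",
        (user_id, problem_id, submission_id),
    )

# Submissions arrive in order; the second one for pair (1, 1) replaces the first.
for args in [(1, 1, 10), (1, 2, 11), (1, 1, 12), (2, 1, 13)]:
    record_submission(*args)

latest_rows = conn.execute(
    "SELECT user_id, problem_id, submission_id FROM latest_submission "
    "ORDER BY problem_id, user_id"
).fetchall()
print(latest_rows)
```

The trade-off is an extra write on every submission in exchange for a cheap read when listing the latest submissions.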

Update:

Since the OP has posted sample code, the solution can now be changed as follows:

for user in User.objects.all(): # Get all users
    user.submission_set.latest('time') # Pick the latest submission based on time.
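
Note that this loop picks one submission per user, not per user-problem pair. A minimal sketch of the per-pair version in plain Python follows. `Row` is a hypothetical stand-in for Submission instances, and `latest_per_pair` is an illustrative helper that only assumes each object has `user_id`, `problem_id`, and `time` attributes.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for a Submission row.
Row = namedtuple("Row", ["id", "user_id", "problem_id", "time"])

def latest_per_pair(submissions):
    """Return {(user_id, problem_id): submission}, keeping the newest by time."""
    latest = {}
    for sub in submissions:
        key = (sub.user_id, sub.problem_id)
        if key not in latest or sub.time > latest[key].time:
            latest[key] = sub
    return latest

rows = [
    Row(1, 1, 1, datetime(2010, 8, 14, 12, 0)),
    Row(2, 1, 1, datetime(2010, 8, 14, 13, 0)),   # newer attempt at the same pair
    Row(3, 1, 2, datetime(2010, 8, 14, 12, 30)),
    Row(4, 2, 1, datetime(2010, 8, 14, 11, 0)),
]
result = latest_per_pair(rows)
print({pair: sub.id for pair, sub in result.items()})
```

With real models you could pass a Submission queryset to the same function, at the cost of loading every submission into memory.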

Original Answer

In the absence of any date/time based criteria for deciding which is "older" or "newer", you can use the primary key (id) of Submission to "neglect the old ones".

for user in User.objects.all(): # Get all users
    user.submission_set.latest('id') # Pick the latest submission by each user.
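
The same "highest id wins" idea can also be applied per user-problem pair in one grouped query. The sketch below shows the SQL shape with the built-in sqlite3 module; the table and column names are assumptions. In Django this kind of query is typically built with values() followed by annotate() and Max.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the table behind the Submission model.
conn.execute(
    "CREATE TABLE submission ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, problem_id INTEGER)"
)
conn.executemany(
    "INSERT INTO submission (id, user_id, problem_id) VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 2, 1), (5, 1, 2)],
)

# One row per pair, carrying the largest (most recently inserted) id.
newest_ids = conn.execute(
    "SELECT user_id, problem_id, MAX(id) FROM submission "
    "GROUP BY user_id, problem_id ORDER BY user_id, problem_id"
).fetchall()
print(newest_ids)
```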
Manoj Govindan answered Oct 17 '22 07:10
