Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Select items that are the top N results for a related table

Say I have a game where a question is asked, people post responses which are scored, and the top 10 responses win. I have a SQL database that stores all of this information, so I might have tables such as Users, Questions, and Responses. The Responses table has foreign_keys user_id and question_id, and attribute total_score.

Obviously, for a particular Question I can retrieve the top 10 Responses with an order and limit:

SELECT * FROM Responses WHERE question_id=? ORDER BY total_score DESC LIMIT 10;

What I'm looking for is a way I can determine, for a particular User, a list of all their Responses that are winners (in the top 10 for their particular Question). It is simple programmatically to step through each Response and see if it is included in the top 10 for its Question, but I would like to optimize this so I am not doing N+1 queries where N is the number of Responses the User has submitted.

like image 487
MikeJ Avatar asked Jul 01 '10 19:07

MikeJ


1 Answers

If you use Oracle, Microsoft SQL Server, DB2, or PostgreSQL, these databases support windowing functions. Join the user's responses to other responses to the same question. Then partition by question and order by score descending. Use the row number within each partition to restrict the set to those in the top 10. Also pass along the user_id of the given user so you can pick them out of the top 10, since you're only interested in the given user's responses.

SELECT *
FROM (
  SELECT r1.user_id AS given_user, r2.*,
    ROW_NUMBER() OVER (PARTITION BY r2.question_id ORDER BY r2.total_score DESC) AS rownum
  FROM Responses r1 JOIN Responses r2 ON r1.question_id = r2.question_id
  WHERE r1.user_id = ?
) t
WHERE rownum <= 10 AND user_id = given_user;

However, if you use MySQL or SQLite or other databases that don't support windowing functions, you can use this different solution:

Query for the user's responses, and use a join to match other responses to the respective questions with greater score (or earlier PK in the case of ties). Group by question, and count the number of responses that have higher score. If the count is fewer than 10, then the user's response is among the top 10 per question.

SELECT r1.*
FROM Responses r1
LEFT OUTER JOIN Responses r2 ON r1.question_id = r2.question_id 
  AND (r1.total_score < r2.total_score 
    OR r1.total_score = r2.total_score AND r1.response_id > r2.response_id)
WHERE r1.user_id = ?
GROUP BY r1.question_id
HAVING COUNT(*) < 10;
like image 59
Bill Karwin Avatar answered Sep 16 '22 20:09

Bill Karwin