Basically I'm trying to pull a random poll question that a user has not yet responded to from a database. This query takes about 10-20 seconds to execute, which is obviously no good! The responses table is about 30K rows and the database also has about 300 questions.
SELECT questions.id
FROM questions
LEFT JOIN responses ON ( questions.id = responses.questionID
AND responses.username = 'someuser' )
WHERE
responses.username IS NULL
ORDER BY RAND() ASC
LIMIT 1
PK for questions and reponses tables is 'id' if that matters.
Any advice would be greatly appreciated.
The joins can be slow if large portions of records from each side need to be scanned. Even if an index is defined on account_customer , all records from the latter still need to be scanned.
If the tables involved in the join operation are too small, say they have less than 10 records and the tables do not possess sufficient indexes to cover the query, in that case, the Left Join is generally faster than Inner Join. As you can see above, both the queries have returned the same result set.
You most likely need an index on
responses.questionID
responses.username
Without the index searching through 30k rows will always be slow.
Here's a different approach to the query which might be faster:
SELECT q.id
FROM questions q
WHERE q.id NOT IN (
SELECT r.questionID
FROM responses r
WHERE r.username = 'someuser'
)
Make sure there is an index on r.username
and that should be pretty quick.
The above will return all the unanswered questios. To choose the random one, you could go with the inefficient (but easy) ORDER BY RAND() LIMIT 1
, or use the method suggested by Tom Leys.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With