Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Tips for improving this slow mysql query?

I'm using a query which generally executes in under a second, but sometimes takes between 10-40 seconds to finish. I'm actually not totally clear on how the subquery works, I just know that it works, in that it gives me 15 rows for each faverprofileid.

I'm logging slow queries and it's telling me 5823244 rows were examined, which is odd because there aren't anywhere close to that many rows in any of the tables involved (the favorites table has the most at 50,000 rows).

Can anyone offer me some pointers? Is it an issue with the subquery and needing to use filesort?

EDIT: Running explain shows that the users table is not using an index (even though id is the primary key). Under extra it says: Using temporary; Using filesort.

SELECT F.id,F.created,U.username,U.fullname,U.id,I.*   
FROM favorites AS F  
INNER JOIN users AS U ON F.faver_profile_id = U.id  
INNER JOIN items AS I ON F.notice_id = I.id  
WHERE faver_profile_id IN (360,379,95,315,278,1)  
AND F.removed = 0  
AND I.removed = 0   
AND F.collection_id is null   
AND I.nudity = 0  
AND (SELECT COUNT(*) FROM favorites WHERE faver_profile_id = F.faver_profile_id  
AND created > F.created AND removed = 0 AND collection_id is null) < 15 
ORDER BY F.faver_profile_id, F.created DESC;
like image 742
makeee Avatar asked Dec 09 '22 21:12

makeee


1 Answers

The number of rows examined represents is large because many rows have been examined more than once. You are getting this because of an incorrectly optimized query plan which results in table scans when index lookups should have been performed. In this case the number of rows examined is exponential, i.e. of an order of magnitude comparable to the product of the total number of rows in more than one table.

  • Make sure that you have run ANALYZE TABLE on your three tables.
  • Read on how to avoid table scans, and identify then create any missing indexes
  • Rerun ANALYZE and re-explain your queries
    • the number of examined rows must drop dramatically
    • if not, post the full explain plan
  • use query hints to force the use of indices (to see the index names for a table, use SHOW INDEX):

SELECT F.id,F.created,U.username,U.fullname,U.id,I.*
FROM favorites AS F FORCE INDEX (faver_profile_id_key)
INNER JOIN users AS U FORCE INDEX FOR JOIN (PRIMARY) ON F.faver_profile_id = U.id
INNER JOIN items AS I FORCE INDEX FOR JOIN (PRIMARY) ON F.notice_id = I.id
WHERE faver_profile_id IN (360,379,95,315,278,1)
AND F.removed = 0
AND I.removed = 0
AND F.collection_id is null
AND I.nudity = 0
AND (SELECT COUNT(*) FROM favorites FORCE INDEX (faver_profile_id_key) WHERE faver_profile_id = F.faver_profile_id
AND created > F.created AND removed = 0 AND collection_id is null) < 15
ORDER BY F.faver_profile_id, F.created DESC;

You may also change your query to use GROUP BY faver_profile_id/HAVING count > 15 instead of the nested SELECT COUNT(*) subquery, as suggested by vartec. The performance of both your original and vartec's query should be comparable if both are properly optimized e.g. using hints (your query would use nested index lookups, whereas vartec's query would use a hash-based strategy.)

like image 199
vladr Avatar answered Dec 12 '22 09:12

vladr