It is common to use SELECT
within SELECT
to reduce the number of queries; but as I examined this leads to slow query (which is obviously harmful for mysql performance). I had a simple query as
SELECT something
FROM posts
WHERE id IN (
SELECT tag_map.id
FROM tag_map
INNER JOIN tags
ON tags.tag_id=tag_map.tag_id
WHERE tag IN ('tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6')
)
This leads to slow queries of "query time 3-4s; lock time about 0.000090s; with about 200 rows examined".
If I split the SELECT
queries, each of them will be quite fast; but this will increase the number of queries which is not good at high concurrency.
Is it the usual situation, or something is wrong with my coding?
In MySQL, doing a subquery like this is a "correlated query". This means that the results of the outer SELECT
depend on the result of the inner SELECT
. The outcome is that your inner query is executed once per row, which is very slow.
You should refactor this query; whether you join twice or use two queries is mostly irrelevant. Joining twice would give you:
SELECT something
FROM posts
INNER JOIN tag_map ON tag_map.id = posts.id
INNER JOIN tags ON tags.tag_id = tag_map.tag_id
WHERE tags.tag IN ('tag1', ...)
For more information, see the MySQL manual on converting subqueries to JOINs.
Tip: EXPLAIN SELECT
will show you how the optimizer plans on handling your query. If you see DEPENDENT SUBQUERY
you should refactor, these are mega-slow.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With