I have the following query:
SELECT
fruit.date,
fruit.name,
fruit.reason,
fruit.id,
fruit.notes,
food.name
FROM
fruit
LEFT JOIN
food_fruits AS ff ON fruit.fruit_id = ff.fruit_id AND ff.type='fruit'
LEFT JOIN
food USING (food_id)
LEFT JOIN
fruits_sour AS fs ON fruits.id = fs.fruit_id
WHERE
(fruit.date < DATE_SUB(NOW(), INTERVAL 180 DAY))
AND (fruit.`status` = 'Rotten')
AND (fruit.location = 'USA')
AND (fruit.size = 'medium')
AND (fs.fruit_id IS NULL)
ORDER BY `food.name` asc
LIMIT 15 OFFSET 0
And all the indexes you could ever want, including the following which are being used:
fruit - fruit_filter (size, status, location, date)
food_fruits - food_type (type)
food - food (id)
fruits_sour - fruit_id (fruit_id)
I even have indexes which I thought would work better which are not being used:
food_fruits - fruit_key (fruit_id, type)
food - id_name (food_id, name)
The ORDER BY
clause is causing a temporary
table and filesort
to be used, unfortunately. Without that, the query runs lickety-split. How can I get this query to not need to filesort
? What am I missing?
EDIT:
The Explain:
The reason for this is your ORDER BY
clause which is done on the field which is not part of index used for this query. The engine can run the query using the fruit_filter
index, but then it has to sort on the different field, and that's when filesort
comes into play (which basically means "sort without using index", thanks to the reminder in comments).
I don't know what times you are getting as a result, but if the difference is a lot, then I would create a temporary table with intermediate results and sorted it afterwards.
(By the way, i am not sure why do you use LEFT JOIN
instead of INNER JOIN
and why do you use food_fruits
- answered in comments)
UPDATE.
Try subquery approach, may be (untested), which splits sorting from pre-filtering:
SELECT
fr.date,
fr.name,
fr.reason,
fr.id,
fr.notes,
food.name
FROM
(
SELECT
fruit.date,
fruit.name,
fruit.reason,
fruit.id,
fruit.notes,
FROM
fruit
LEFT JOIN
fruits_sour AS fs ON fruit.id = fs.fruit_id
WHERE
(fruit.date < DATE_SUB(NOW(), INTERVAL 180 DAY))
AND (fruit.`status` = 'Rotten')
AND (fruit.location = 'USA')
AND (fruit.size = 'medium')
AND (fs.fruit_id IS NULL)
) as fr
LEFT JOIN
food_fruits AS ff ON fr.fruit_id = ff.fruit_id AND ff.type='fruit'
LEFT JOIN
food USING (food_id)
ORDER BY `food.name` asc
LIMIT 15 OFFSET 0
Your ORDER BY ... LIMIT
clauses require some sorting, you know. The trick to optimizing performance is to ORDER BY ... LIMIT
the minimal set of columns, and then build your full result set based on the chosen fifteen rows. So let's try for a minimal set of columns in a subquery.
SELECT fruit.id,
food.name
FROM fruit
LEFT JOIN food_fruits AS ff ON fruit.fruit_id = ff.fruit_id
AND ff.type='fruit'
LEFT JOIN food USING (food_id)
LEFT JOIN fruits_sour AS fs ON fruits.id = fs.fruit_id
WHERE fruit.date < DATE_SUB(NOW(), INTERVAL 180 DAY)
AND fruit.`status` = 'Rotten'
AND fruit.location = 'USA'
AND fruit.size = 'medium'
AND fs.fruit_id IS NULL
ORDER BY food.name ASC
LIMIT 15 OFFSET 0
This query gives you the fifteen top ids and their names.
I would add id
to the end of your existing fruit_filter
index to give (size, status, location, date, id)
. That will make it into a compound covering index, and allow your filtering query to be satisfied entirely from the index.
Other than that, it's going to be hard to optimize this using more or different indexes because so much of the query is driven by other factors, like the LEFT JOIN ... IS NULL
join-fail criterion you have applied.
Then you can join this subquery to your fruits table to pull the full result set.
That will look like this when it's all done.
SELECT fruit.date,
fruit.name,
fruit.reason,
fruit.id,
fruit.notes,
list.name
FROM fruit
JOIN (
SELECT fruit.id,
food.name
FROM fruit
LEFT JOIN food_fruits AS ff ON fruit.fruit_id = ff.fruit_id
AND ff.type='fruit'
LEFT JOIN food USING (food_id)
LEFT JOIN fruits_sour AS fs ON fruits.id = fs.fruit_id
WHERE fruit.date < DATE_SUB(NOW(), INTERVAL 180 DAY)
AND fruit.`status` = 'Rotten'
AND fruit.location = 'USA'
AND fruit.size = 'medium'
AND fs.fruit_id IS NULL
ORDER BY food.name ASC
LIMIT 15 OFFSET 0
) AS list ON fruit.id = list.id
ORDER BY list.name
Do you see how this goes? In the subquery you sling around just enough data to identify which tiny subset of rows you want to retrieve. Then, you join that subquery to your main table to pull out all your data. Limiting the row length in the stuff you have to sort helps performance because MySQL can sort it its sort buffer, rather than having to do a more elaborate and slower sort / merge operation. (But, you can't tell from EXPLAIN whether it will do this or not.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With