I have a browse category query that im trying to optimize. Im ending up with Using temporary; Using filesort in the explain and the query is slow on a category with say 60,000 rows. If i remove the Order By clauses the query runs very fast .05 seconds to do 60,000 rows. With the Order By clauses its very slow around 5 seconds. Parts contains some 500,000 rows as does Parts_Category.
I have a group index on Parts (status, level, warehouse, updated) called sort_index
At the top of the explain I have | ALL | Using temporary; Using filesort
All the other indexes are showing OK. Can somebody please tell me what the problem might be? Im out of ideas. Maybe i should rearrange this query so i can get better performance perhaps?
query.
SELECT Parts.*, Image.type, Image.width, Image.height,
(SELECT name FROM Location_State WHERE id = Parts.state_id) AS state,
(SELECT name FROM Location_Region WHERE id = Parts.region_id) AS region,
(SELECT start_date FROM Promotion WHERE id = Parts.promotion_id) AS promotion_start_date,
(SELECT end_date FROM Promotion WHERE id = Parts.promotion_id) AS promotion_end_date
FROM ( SELECT parts_id FROM Parts_Category WHERE Parts_Category.category_id = '40'
UNION SELECT parts_id FROM Parts_Category WHERE Parts_Category.main_category_id = '40') cid
LEFT JOIN Image ON Parts.image_id = Image.id
JOIN Parts ON Parts.id = cid.parts_id AND Parts.status = 'A'
ORDER BY Parts.level DESC, Parts.warehouse DESC, Parts.updated DESC LIMIT 0, 15
Table structure for table Parts Field Type Null Default id int(11) No auto_increment image_id int(11) Yes 0 gallery_id int(11) Yes 0 image_count int(3) Yes 0 promotion_id int(11) Yes 0 country_id int(11) Yes NULL state_id int(11) Yes NULL region_id int(11) Yes NULL city_id int(11) Yes NULL area_id int(11) Yes NULL updated datetime Yes 0000-00-00 00:00:00 entered datetime Yes 0000-00-00 00:00:00 renewal_date date Yes 0000-00-00 discount_id varchar(10) Yes NULL title varchar(100) Yes search_title varchar(255) Yes warehouse varchar(50) Yes url varchar(255) Yes display_url varchar(255) Yes friendly_url varchar(100) Yes NULL description varchar(255) Yes keywords varchar(1000) Yes NULL attachment_file varchar(255) Yes attachment_caption varchar(255) Yes status char(1) Yes level tinyint(3) Yes 0 worldwide tinyint(1) Yes 0 random_number int(11) Yes NULL reminder tinyint(4) Yes NULL category_search varchar(1000) Yes video_snippet varchar(1000) Yes importID int(11) Yes 0 Indexes PRIMARY 518623 id random_number INDEX 32201 random_number country_id INDEX 1 country_id state_id INDEX 8 state_id region_id INDEX 5 region_id renewal_date INDEX 1 renewal_date worldwide INDEX 1 worldwide friendly_url INDEX 518623 friendly_url promotion_id INDEX 1 promotion_id city_id INDEX 1 city_id area_id INDEX 1 area_id zip_code INDEX 2790 zip_code importID INDEX 518623 importID image_id INDEX 10 image_id -------------- index_browse_category INDEX 52 level status warehouse updated ----------------- keywords FULLTEXT 1 description keywords category_search Parts_Category id int(11) No auto_increment parts_id int(11) No 0 category_id int(11) No 0 main_category_id int(10) No 0 Index PRIMARY PRIMARY 519330 id category_id INDEX 519330 category_id parts_id main_category_id INDEX 519330 main_category_id parts_id
Remove any unnecessary indexes on the table, paying particular attention to UNIQUE indexes as these disable change buffering. Don't use a UNIQUE index unless you need it; instead, employ a regular INDEX. Take a look at your slow query log every week or two. Pick the slowest three queries and optimize those.
The most general way to satisfy a GROUP BY clause is to scan the whole table and create a new temporary table where all rows from each group are consecutive, and then use this temporary table to discover groups and apply aggregate functions (if any).
Try rewriting your query as this:
SELECT p.*, i.type, i.width, i.height,
(SELECT name FROM Location_State WHERE id = p.state_id) AS state,
(SELECT name FROM Location_Region WHERE id = p.region_id) AS region,
(SELECT start_date FROM Promotion WHERE id = p.promotion_id) AS promotion_start_date,
(SELECT end_date FROM Promotion WHERE id = p.promotion_id) AS promotion_end_date
FROM parts p
LEFT JOIN
image i
ON i.id = p.image_id
WHERE EXISTS (
SELECT NULL
FROM Parts_Category pc
WHERE pc.category_id = '40'
AND pc.parts_id = p.id
UNION ALL
SELECT NULL
FROM Parts_Category pc
WHERE pc.main_category_id = '40'
AND pc.parts_id = p.id
)
AND p.status = 'A'
ORDER BY
p.status DESC, p.level DESC, p.warehouse DESC, p.updated DESC
LIMIT 15
You need the following indexes for this to work efficiently:
parts (status, level, warehouse, updated) -- this one you have
parts_category (category_id, parts_id)
parts_category (main_category_id, parts_id)
Update:
I just created the tables as this:
DROP TABLE IF EXISTS `test`.`image`;
CREATE TABLE `test`.`image` (
`id` int(11) NOT NULL,
`type` int(11) NOT NULL,
`width` int(11) NOT NULL,
`height` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`location_region`;
CREATE TABLE `test`.`location_region` (
`id` int(11) NOT NULL,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`location_state`;
CREATE TABLE `test`.`location_state` (
`id` int(11) NOT NULL,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`parts`;
CREATE TABLE `test`.`parts` (
`id` int(11) NOT NULL,
`status` char(1) NOT NULL,
`level` int(11) NOT NULL,
`warehouse` int(11) NOT NULL,
`updated` int(11) NOT NULL,
`state_id` int(11) NOT NULL,
`region_id` int(11) NOT NULL,
`promotion_id` int(11) NOT NULL,
`image_id` int(11) NOT NULL DEFAULT '1',
PRIMARY KEY (`id`),
KEY `status` (`status`,`level`,`warehouse`,`updated`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`parts_category`;
CREATE TABLE `test`.`parts_category` (
`id` int(11) NOT NULL,
`parts_id` int(11) NOT NULL,
`category_id` int(11) NOT NULL,
`main_category_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `ix_pc_cat_parts` (`category_id`,`parts_id`),
KEY `ix_pc_main_parts` (`main_category_id`,`parts_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`promotion`;
CREATE TABLE `test`.`promotion` (
`id` int(11) NOT NULL,
`start_date` datetime NOT NULL,
`end_date` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
and filled them with sample data:
INSERT
INTO parts
SELECT id,
CASE WHEN RAND() < 0.1 THEN 'A' ELSE 'B' END,
RAND() * 100,
RAND() * 100,
RAND() * 100,
RAND() * 50,
RAND() * 50,
RAND() * 50,
RAND() * 50
FROM t_source
LIMIT 500000;
INSERT
INTO parts_category
SELECT id,
id,
RAND() * 100,
RAND() * 100
FROM t_source
LIMIT 500000;
INSERT
INTO location_state
SELECT id, CONCAT('State ', id)
FROM t_source
LIMIT 1000;
INSERT
INTO location_region
SELECT id, CONCAT('Region ', id)
FROM t_source
LIMIT 1000;
INSERT
INTO promotion
SELECT id,
'2009-07-22' - INTERVAL RAND() * 5 - 20 DAY,
'2009-07-22' - INTERVAL RAND() * 5 DAY
FROM t_source
LIMIT 1000;
The query above runs for 30 milliseconds
and yields the following plan:
1, 'PRIMARY', 'p', 'ref', 'status', 'status', '3', 'const', 107408, 'Using where'
1, 'PRIMARY', 'i', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.image_id', 1, ''
6, 'DEPENDENT SUBQUERY', 'pc', 'ref', 'ix_pc_cat_parts', 'ix_pc_cat_parts', '8', 'const,test.p.id', 1, 'Using index'
7, 'DEPENDENT UNION', 'pc', 'ref', 'ix_pc_main_parts', 'ix_pc_main_parts', '8', 'const,test.p.id', 1, 'Using index'
, 'UNION RESULT', '<union6,7>', 'ALL', '', '', '', '', , ''
5, 'DEPENDENT SUBQUERY', 'Promotion', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.promotion_id', 1, ''
4, 'DEPENDENT SUBQUERY', 'Promotion', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.promotion_id', 1, ''
3, 'DEPENDENT SUBQUERY', 'Location_Region', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.region_id', 1, ''
2, 'DEPENDENT SUBQUERY', 'Location_State', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.state_id', 1, ''
As you can see, no temporary
, no filesort
, everything's very fast.
To help you anymore, I just need to see how your tables are defined.
John, the problem is your query is constructed such that it is selecting from a derived table. The derived table cannot benefit from your indices. Try updating your query as follows:
SELECT
Parts.*,
Image.type, Image.width, Image.height,
Location_State.name AS state,
Location_Region.name AS region,
Promotion.start_date AS promotion_start_date,
Promotion.end_date AS promotion_end_date
FROM Parts
LEFT JOIN Image ON Parts.image_id = Image.id
LEFT JOIN Location_State ON Parts.state_id = Location_State.id
LEFT JOIN Location_Region ON Parts.state_id = Location_Region.id
LEFT JOIN Promotion ON Parts.promotion_id = Promotion.id
INNER JOIN Parts_Category ON (Parts_Category.category_id = 40 OR Parts_Category.main_category_id = 40)
WHERE Parts.status = 'A'
GROUP BY Parts.id
ORDER BY Parts.level DESC, Parts.warehouse DESC, Parts.updated DESC LIMIT 0, 15
Note, if you don't need LEFT JOIN's for your Location_State, Location_Region, Promotion tables, then use an INNER JOIN instead. It will likely perform better.
To further assist in optimizing this query, please provide the following:
SHOW CREATE TABLE Parts;
If the rewritten query I provided works the same as your example (it should), then also provide:
EXPLAIN <my query here>\G
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With