My mysql DB has become CPU hungry trying to execute a particularly slow query. When I do an explain, mysql says "Using where; Using temporary; Using filesort". Please help deciphering and solving this puzzle.
Table structure:
CREATE TABLE `topsources` (
`USER_ID` varchar(255) NOT NULL,
`UPDATED_TIME` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`URL_ID` int(11) NOT NULL,
`SOURCE_SLUG` varchar(100) NOT NULL,
`FEED_PAGE_URL` varchar(255) NOT NULL,
`CATEGORY_SLUG` varchar(100) NOT NULL,
`REFERRER` varchar(2048) DEFAULT NULL,
PRIMARY KEY (`USER_ID`,`URL_ID`),
KEY `USER_ID` (`USER_ID`),
KEY `FEED_PAGE_URL` (`FEED_PAGE_URL`),
KEY `SOURCE_SLUG` (`SOURCE_SLUG`),
KEY `CATEGORY_SLUG` (`CATEGORY_SLUG`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
The table has 370K rows...sometimes higher. The below query takes 10+ seconds.
SELECT topsources.SOURCE_SLUG, COUNT(topsources.SOURCE_SLUG) AS VIEW_COUNT
FROM topsources
WHERE CATEGORY_SLUG = '/newssource'
GROUP BY topsources.SOURCE_SLUG
HAVING MAX(CASE WHEN topsources.USER_ID = 'xxxx' THEN 1 ELSE 0 END) = 0
ORDER BY VIEW_COUNT DESC;
Here's the extended explain:
+----+-------------+------------+------+---------------+---------------+---------+-------+--------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------+---------------+---------------+---------+-------+--------+----------+----------------------------------------------+
| 1 | SIMPLE | topsources | ref | CATEGORY_SLUG | CATEGORY_SLUG | 302 | const | 160790 | 100.00 | Using where; Using temporary; Using filesort |
+----+-------------+------------+------+---------------+----
-----------+---------+-------+--------+----------+----------------------------------------------+
Is there a way to improve this query? Also, are there any mysql settings that can help in reducing CPU load? I can allocate more memory that's available on my server.
The most likely thing to help the query is an index on CATEGORY_SLUG, especially if it takes on many values. (That is, if the query is highly selective.) The query needs to read the entire table to get the results -- although 10 seconds seems like a long time.
I don't think the HAVING clause would be affecting the query processing.
Does the query take just as long if you run it two times in a row?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With