Our team just spent the last week debugging and trying to find the source of many mysql lock timeouts and many extremely long running queries. In the end it appears that this query is the culprit.
mysql> explain
SELECT categories.name AS cat_name,
COUNT(distinct items.id) AS category_count
FROM `items`
INNER JOIN `categories` ON `categories`.`id` = `items`.`category_id`
WHERE `items`.`state` IN ('listed', 'reserved')
AND (items.category_id IS NOT NULL)
GROUP BY categories.name
ORDER BY category_count DESC
LIMIT 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: items
type: range
possible_keys: index_items_on_category_id,index_items_on_state
key: index_items_on_category_id
key_len: 5
ref: NULL
rows: 119371
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: categories
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: production_db.items.category_id
rows: 1
Extra:
2 rows in set (0.00 sec)
I can see that it is doing a nasty table scan and creating a temporary table to run.
Why would this query cause database response times to go up by a factor of ten and some queries that usually take 40-50ms (updates on items table), to explode to 50,000 ms and higher at times?
The common causes are: The offensive transaction is not fast enough to commit or rollback the transaction within innodb_lock_wait_timeout duration. The offensive transaction is waiting for row lock to be released by another transaction.
Lock timeout detection is a database manager feature that prevents applications from waiting indefinitely for a lock to be released.
If a transaction is waiting for a lock, it is in a LOCK WAIT state. (The INFORMATION_SCHEMA INNODB_TRX table indicates transaction state values.) The Performance Schema data_locks table holds one or more rows for each LOCK WAIT transaction, indicating any lock requests that prevent its progress.
Is hard to tell without more information like
My guess would be that the query is too slow and its running inside a transaction (which it probably is since you have this problem) and is probably issuing range-locks on the items table which cannot allow writes to proceed hence slowing the updates till they can get a lock on the table.
And I have a couple of comments based on what I can see from your query and execution plan:
1) Your items.state would probably be better as a catalog, instead of having the string on every row in items, this is for space efficiency and comparing IDs is way faster than comparing strings (regardless of whatever optimizations the engine may do).
2) I am guessing items.state is a column with low cardinality (few unique values), hence an index in that column is probably hurting you more than helping you. Every index adds over head when inserting/deleting/updating rows since the indexes have to be mantained, this particular index probably is not used that much to be worthwhile. Of course, I am just guessing, it depends on the rest of the queries.
SELECT
; Grouping by name, means comparing strings.
categories.name AS cat_name,
; No need for distinct, the same item.id cannot belong to different categories
COUNT(distinct items.id) AS category_count
FROM `items`
INNER JOIN `categories` ON `categories`.`id` = `items`.`category_id`
WHERE `items`.`state` IN ('listed', 'reserved')
; Not needed, the inner join gets rid of items with no category_id
AND (items.category_id IS NOT NULL)
GROUP BY categories.name
ORDER BY category_count DESC
LIMIT 10\G
The way this query is structured is basically having to scan the entire items table since its using the category_id index, then filtering by the where clause, then, joining with the category table, which means an index seek on the primary key (categories.id) index per item row in the items result set. Then grouping by name (using strings comparison) to count, then getting rid of everything but 10 of the results.
I would write the query like:
SELECT categories.name, counts.n
FROM (SELECT category_id, COUNT(id) n
FROM items
WHERE state IN ('listed', 'reserved') AND category_id is not null
GROUP BY category_id ORDER BY COUNT(id) DESC LIMIT 10) counts
JOIN categories on counts.category_id = categories.id
ORDER BY counts.n desc
(I am sorry if the syntax ain't perfect I am not running MySQL)
With this query what the engine will probably do is :
Use the items.state index to get the 'listed', 'reserved' items and group by category_id comparing numbers, not strings then getting only the 10 topmost counts, then join with categories to get the name (but using only 10 index seeks).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With