To start, a few details to describe the situation as a whole:
That said, this is the very highly used table (fifty or so INSERTs per second, many SELECTs, row-level locking is utilized) I'm running the DELETE query on:
CREATE TABLE `sales` (
`sale_id` int(32) unsigned NOT NULL auto_increment,
`start_time` int(20) unsigned NOT NULL,
`end_time` int(20) unsigned default NULL,
`identifier` char(9) NOT NULL,
`zip_code` char(5) NOT NULL,
`income` mediumint(6) unsigned NOT NULL,
PRIMARY KEY USING BTREE (`sale_id`),
UNIQUE KEY `SALE_DATA` (`ssn`,`zip_code`,`income`),
KEY `SALE_START` USING BTREE (`start_time`)
) ENGINE=InnoDB DEFAULT CHARSET=ascii ROW_FORMAT=FIXED
The DELETE
query looks like this, and is run every five minutes on cron (I'd prefer to run it once per minute):
DELETE FROM `sales` WHERE
`start_time` < UNIX_TIMESTAMP(NOW() - INTERVAL 30 MINUTE);
I've used INT
for the time field because it is apparent that MySQL has trouble using indexes with DATETIME
fields.
So this is the problem: The DELETE
query seems to run fine the majority of the time (maybe 7 out of 10 times). Other times, the query finishes quickly, but MySQL seems to get choked up for awhile afterwards. I can't exactly prove it's MySQL that is acting up, but the times the symptoms happen definitely coincides with the times that this query is run. Here are the symptoms while everything is choked up:
SHOW FULL PROCESSLIST;
, there are just a few INSERT INTO
sales...
queries running, where normally there are more than a hundred. What's abnormal here is actually the lack of any tasks in the process list, rather than there being too many. It seems MySQL stops taking connections entirely.top
, there are many Apache processes using lots of CPU.Eventually, after about a minute or two, things recover on their own without any intervention. CPU usage goes back to normal, Apache and MySQL resume normal operations.
So, what can I do? :) How can I even begin to investigate why this is happening? I need that DELETE query to run for various reasons, why do things go bonkers when it's run (but not all the time)?
hard one. This is not a response but the start of a brainstorming.
I would say, maybe, a re-Index problem on delete, on the doc we can find "delete quick" followed by "optimize table" to try avoiding the multi index-merge.
One other possibility, maybe as well, is a chain of dead lock on delete with at least one other thread, row locks could pause the delete operation, and the delete operation could pause some next row lock. And then you've got either a detected deadlock , or an undetected one and so a timeout occuring. How do you detect such concurrency aborted exceptions? Do you re-run your transactions? If you threads are doing a lot of different row locks in the same transactions chances are that the first deadlock will impact more and more threads (traffic jam).
Did you tried to lock the table in the delete transaction? Check the manual, the way of locking tables in transaction in Innodb or to get a SHARE LOCK on all rows. Maybe it will take you some time to get the table only for you but if your delete is quite fast no one will notice you've taken the table for you only for 1s.
Now even if you do not tried it before, it's maybe what the delete is doing. Check as well this doc on implicit locks, your delete query should be using the start_time index, so I'm quite sure your current delete is not locking all rows (not completly sure, they lock al analysed rows not only the rows matching the where condition), but the delete is quite certainly blocking inserts. Some examples of deadlocks with transaction performing deletes are explained. Good luck! For me it's too late to understand all the lock isolation impacts.
edit you could try to change your DELETE by an UPDATE setting a deleted=1, and perform the real delete on low usage times (if you have some). And change the client queries to check this indexed deleted status.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With