I have a fairly large InnoDB table which contains about 10 million rows (and counting, it is expected to become 20 times that size). Each row is not that large (131 B on average), but from time to time I have to delete a chunk of them, and that is taking ages. This is the table structure:
CREATE TABLE `problematic_table` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `taxid` int(10) unsigned NOT NULL, `blastdb_path` varchar(255) NOT NULL, `query` char(32) NOT NULL, `target` int(10) unsigned NOT NULL, `score` double NOT NULL, `evalue` varchar(100) NOT NULL, `log_evalue` double NOT NULL DEFAULT '-999', `start` int(10) unsigned DEFAULT NULL, `end` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`id`), KEY `taxid` (`taxid`), KEY `query` (`query`), KEY `target` (`target`), KEY `log_evalue` (`log_evalue`) ) ENGINE=InnoDB AUTO_INCREMENT=7888676 DEFAULT CHARSET=latin1;
Queries that delete large chunks from the table are simply like this:
DELETE FROM problematic_table WHERE problematic_table.taxid = '57';
A query like this just took almost an hour to finish. I can imagine that the index rewriting overhead makes these queries very slow.
I am developing an application that will run on pre-existing databases. I most likely have no control over server variables unless I make changes to them mandatory (which I would prefer not to), so I'm afraid suggestions that change those are of little value.
I have tried to INSERT ... SELECT
those rows that I don't want to delete into a temporary table and just dropping the rest, but as the ratio of to-delete vs. to-keep shifts towards to-keep, this is no longer a useful solution.
This is a table that may see frequent INSERT
s and SELECT
s in the future, but no UPDATE
s. Basically, it's a logging and reference table that needs to drop parts of its content from time to time.
Could I improve my indexes on this table by limiting their length? Would switching to MyISAM help, which supports DISABLE KEYS
during transactions? What else could I try to improve DELETE
performance?
Edit: One such deletion would be in the order of about one million of rows.
To delete rows more quickly, you can increase the size of the key cache by increasing the key_buffer_size system variable. See Section 5.1. 1, “Configuring the Server”. To delete all rows from a MyISAM table, TRUNCATE TABLE tbl_name is faster than DELETE FROM tbl_name .
OPTIMIZE TABLE works for InnoDB , MyISAM , and ARCHIVE tables. OPTIMIZE TABLE is also supported for dynamic columns of in-memory NDB tables. It does not work for fixed-width columns of in-memory tables, nor does it work for Disk Data tables.
Since InnoDB stores in a fixed 16K "frame" whereas MyISAM stores in frames that can vary in size up 16777212 (16M - 4), one should always expect InnoDB tables to be larger due to page splits that are required to fit data into fixed 16K "frames".
I had a similar scenario with a table with 2 million rows and a delete statement, which should delete around a 100 thousand rows - it took around 10 minutes to do so.
After I checked the configuration, I found that MySQL Server was running with default innodb_buffer_pool_size
= 8 MB (!).
After restart with innodb_buffer_pool_size
= 1.5GB, the same scenario took 10 sec.
So it looks like there is a dependency if "reordering of the table" can fit in buffer_pool or not.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With