InnoDB indexes before and after importing

Tags: sql, mysql, innodb

I'm trying to import a large SQL file generated by mysqldump for an InnoDB table, but it is taking a very long time, even after adjusting some parameters in my.cnf and disabling AUTOCOMMIT, FOREIGN_KEY_CHECKS, and UNIQUE_CHECKS (though the table has no foreign or unique keys). I'm wondering if the import is so slow because of the several indexes on the table.
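For reference, this is roughly how I'm wrapping the import (dump.sql stands in for my actual file):

```sql
-- Session settings toggled around the import, restored afterward
SET autocommit=0;
SET unique_checks=0;
SET foreign_key_checks=0;
SOURCE dump.sql;
COMMIT;
SET unique_checks=1;
SET foreign_key_checks=1;
```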

Looking at the SQL file, it appears that the indexes are created in the CREATE TABLE statement, prior to inserting all the data. Based on my (limited) research and personal experience, I've found that it's faster to add indexes after inserting all the data. Doesn't it have to check the indexes for every INSERT? I know that mysqldump has a --disable-keys option that does exactly that: disables the keys prior to inserting. But apparently this only works with MyISAM tables, not InnoDB.
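For what it's worth, this is the shape of what --disable-keys emits around the inserts in a MyISAM dump (mytable is a placeholder name):

```sql
/*!40000 ALTER TABLE `mytable` DISABLE KEYS */;
-- ... all the INSERT statements ...
/*!40000 ALTER TABLE `mytable` ENABLE KEYS */;
```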

But why couldn't mysqldump omit the keys from the CREATE TABLE statement for InnoDB tables, and then run an ALTER TABLE to add them after all the data is inserted? Or does InnoDB work differently, so that there is no speed difference?
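In other words, I was imagining dump output shaped like this (hypothetical table and column names):

```sql
CREATE TABLE t (
  id INT NOT NULL,
  col1 VARCHAR(50),
  PRIMARY KEY (id)                          -- clustered index must stay
) ENGINE=InnoDB;

INSERT INTO t VALUES (1, 'a'), (2, 'b');    -- ... all the data ...

ALTER TABLE t ADD INDEX idx_col1 (col1);    -- secondary index added last
```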

Thanks!

Altherat asked Jun 21 '18


1 Answer

I experimented with this concept a bit at a past job, where we needed a fast method of copying schemas between MySQL servers.

There is indeed a performance overhead when you insert into tables that have secondary indexes. Inserts need to update the clustered index (aka the table), and also update the secondary indexes. The more indexes a table has, the more overhead each insert incurs.

InnoDB has a feature called the change buffer which helps a bit by postponing secondary index updates, but the buffered changes have to be merged eventually.
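If you want to check or tune this on your own server, the relevant server variable is innodb_change_buffering (this is stock MySQL, nothing exotic):

```sql
-- See whether change buffering is enabled, and for which operations
SHOW VARIABLES LIKE 'innodb_change_buffering';

-- Buffering can be restricted or disabled; 'all' is the default
SET GLOBAL innodb_change_buffering = 'all';
```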

Inserts to a table with no secondary indexes are faster, so it's tempting to try to defer index creation until after your data is loaded, as you describe.

Percona Server, a branch of MySQL, experimented with a mysqldump --optimize-keys option. With this option, mysqldump outputs CREATE TABLE with no secondary indexes, then all the INSERTs, then an ALTER TABLE to add the indexes after the data is loaded. See https://www.percona.com/doc/percona-server/LATEST/management/innodb_expanded_fast_index_creation.html
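Invocation is just the extra flag on Percona Server's mysqldump (database and table names below are placeholders):

```sh
# Requires Percona Server's mysqldump, not the stock MySQL one
mysqldump --optimize-keys mydb mytable > dump.sql
```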

But in my experience, the net improvement in performance was small. It still takes a while to insert a lot of rows, even into a table with no indexes, and then the restore has to run an ALTER TABLE to build the indexes, which takes a while for a large table. Counting the time for the INSERTs plus the extra time to build indexes, it's only a few (low single-digit) percent faster than inserting the traditional way, into a table with its indexes in place.

Another benefit of this post-load index creation is that the indexes are stored more compactly; if you need to save disk space, that's a better reason to use this technique.

I found it much more beneficial to performance to restore by loading several tables in parallel.

  • The newer MySQL tool mysqlpump (introduced in MySQL 5.7) supports multi-threaded dump.
  • The open-source tool mydumper supports multi-threaded dump, and also has a multi-threaded restore tool called myloader. The worst downside of mydumper/myloader is that the documentation is virtually non-existent, so you need to be an intrepid power user to figure out how to run them (see the sketch after this list).
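A minimal sketch of both (database name, thread count, and paths are placeholders; check each tool's --help for your version):

```sh
# mysqlpump: parallel dump (the restore of its SQL output is still single-threaded)
mysqlpump --default-parallelism=4 mydb > dump.sql

# mydumper/myloader: parallel dump and parallel restore
mydumper --database=mydb --threads=4 --outputdir=/backups/mydb
myloader --directory=/backups/mydb --threads=4
```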

Another strategy is to use mysqldump --tab to dump delimited data files instead of SQL scripts. Bulk-loading data files is much faster than executing SQL scripts to restore the same data. To be precise, --tab writes, for each table, a .sql file with the table definition and a tab-delimited text file with the data. You recreate the tables by running all the .sql files (this is quick), then use mysqlimport to load the data files. The mysqlimport tool even has a --use-threads option for parallel execution.
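A sketch of that workflow (the dump directory is a placeholder, and it must be writable by the mysqld process, since the server itself writes the data files):

```sh
# Dump: one .sql (definition) and one .txt (tab-delimited data) per table
mysqldump --tab=/tmp/dump mydb

# Restore: recreate the tables, then bulk-load the data files in parallel
cat /tmp/dump/*.sql | mysql mydb
mysqlimport --use-threads=4 mydb /tmp/dump/*.txt
```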

Test carefully with different numbers of parallel threads. My experience is that 4 threads is the best. With greater parallelism, InnoDB becomes a bottleneck. But your experience may be different, depending on the version of MySQL and your server hardware's performance capacity.

The fastest restore method of all is to use a physical backup tool; the most popular is Percona XtraBackup. This allows fast backups and even faster restores: the backed-up files are literally ready to be copied into place and used as live tablespace files. The downside is that you must shut down your MySQL Server to perform the restore.
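A rough outline of a backup/restore cycle with XtraBackup (paths and service names are placeholders; exact flags vary by version, so check the Percona docs):

```sh
# Take and prepare a full backup while the server is running
xtrabackup --backup --target-dir=/backups/full
xtrabackup --prepare --target-dir=/backups/full

# Restore: the server must be stopped and the data directory empty
systemctl stop mysql
xtrabackup --copy-back --target-dir=/backups/full
chown -R mysql:mysql /var/lib/mysql
systemctl start mysql
```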

Bill Karwin answered Oct 19 '22