 

What's the quickest way to dump & load a MySQL InnoDB database using mysqldump?

I would like to create a copy of a database with approximately 40 InnoDB tables and around 1.5GB of data with mysqldump and MySQL 5.1.

What are the best parameters (e.g., --single-transaction) that will result in the quickest dump and load of the data?

As well, when loading the data into the second DB, is it quicker to:

1) pipe the results directly to the second MySQL server instance and use the --compress option

or

2) load it from a text file (e.g., mysql < my_sql_dump.sql)

asked Sep 25 '08 by Josh Schwartzman


2 Answers

QUICKLY dumping a quiesced database:

Using the "-T " option with mysqldump results in lots of .sql and .txt files in the specified directory. This is ~50% faster for dumping large tables than a single .sql file with INSERT statements (takes 1/3 less wall-clock time).

Additionally, there is a huge benefit when restoring if you can load multiple tables in parallel, and saturate multiple cores. On an 8-core box, this could be as much as an 8X difference in wall-clock time to restore the dump, on top of the efficiency improvements provided by "-T". Because "-T" causes each table to be stored in a separate file, loading them in parallel is easier than splitting apart a massive .sql file.
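A rough sketch of such a parallel restore, assuming the /tmp/dumpdir layout from above; the database name, credentials, and degree of parallelism are placeholders. mysqlimport derives each table name from its .txt file name, which is exactly what -T produces:

    # Load the table definitions first (serially).
    for f in /tmp/dumpdir/*.sql; do
        mysql --user=root --password=secret mydb < "$f"
    done
    # Then load the data files in parallel; -P 8 runs up to 8 imports at once.
    ls /tmp/dumpdir/*.txt | xargs -n 1 -P 8 \
        mysqlimport --user=root --password=secret mydb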

Taking the strategies above to their logical extreme, one could create a script to dump a database widely in parallel. Well, that's exactly what the Maatkit mk-parallel-dump (see http://www.maatkit.org/doc/mk-parallel-dump.html) and mk-parallel-restore tools are: Perl scripts that make multiple calls to the underlying mysqldump program. However, when I tried to use these, I had trouble getting the restore to complete without duplicate key errors that didn't occur with vanilla dumps, so keep in mind that your mileage may vary.

Dumping data from a LIVE database (w/o service interruption):

The --single-transaction switch is very useful for taking a dump of a live database without having to quiesce it, or for taking a dump of a slave database without having to stop replication.

Sadly, -T is not compatible with --single-transaction, so you only get one.
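A sketch of such a non-blocking dump, with placeholder credentials; --master-data=2 is an optional extra that records the binlog coordinates as a comment in the dump, handy if you are seeding a new slave:

    # --single-transaction gives a consistent snapshot for InnoDB tables
    # without locking them; MyISAM tables are not covered by the snapshot.
    mysqldump --single-transaction --master-data=2 \
        --user=root --password=secret mydb > mydb.sql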

Usually, taking the dump is much faster than restoring it. There is still room for a tool that takes the incoming monolithic dump file and breaks it into multiple pieces to be loaded in parallel. To my knowledge, such a tool does not yet exist.


Transferring the dump over the network is usually a win

To listen for an incoming dump on one host, run:

nc -l 7878 > mysql-dump.sql

Then, on your DB host, run:

mysqldump $OPTS | nc myhost.mydomain.com 7878

This reduces contention for the disk spindles on the master from writing the dump to disk, slightly speeding up your dump (assuming the network is fast enough to keep up, a fairly safe assumption for two hosts in the same datacenter). Plus, if you are building out a new slave, this saves the step of having to transfer the dump file after it is finished.

Caveats: obviously, you need enough network bandwidth not to slow things down unbearably, and if the TCP session breaks you have to start all over, but for most dumps this is not a major concern.
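Two variants of the same idea, with placeholder host names and credentials: compress the stream if bandwidth is the bottleneck, or pipe the dump straight into the second server (option 1 from the question) and skip the intermediate file entirely:

    # Compressed over the wire - on the receiving host:
    nc -l 7878 | gunzip > mysql-dump.sql
    # ...and on the DB host:
    mysqldump $OPTS mydb | gzip | nc myhost.mydomain.com 7878

    # Or load directly into the target server, no dump file at all:
    mysqldump $OPTS mydb | mysql --host=target.mydomain.com --user=root --password=secret mydb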


Lastly, I want to clear up one point of common confusion.

Despite how often you see these flags in mysqldump examples and tutorials, they are superfluous because they are turned ON by default:

  • --opt
  • --add-drop-table
  • --add-locks
  • --create-options
  • --disable-keys
  • --extended-insert
  • --lock-tables
  • --quick
  • --set-charset

From http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html:

Use of --opt is the same as specifying --add-drop-table, --add-locks, --create-options, --disable-keys, --extended-insert, --lock-tables, --quick, and --set-charset. All of the options that --opt stands for also are on by default because --opt is on by default.

Of those behaviors, "--quick" is one of the most important (it skips caching the entire result set in mysqld before transmitting the first row), and it can also be used with "mysql" (which does NOT turn --quick on by default) to dramatically speed up queries that return a large result set (e.g., dumping all the rows of a big table).
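For example (table name and credentials are placeholders), streaming a big table out through the command-line client without buffering it on the client side:

    # --quick makes the mysql client fetch and print rows one at a time
    # instead of accumulating the whole result set in memory first.
    mysql --quick --user=root --password=secret \
        -e "SELECT * FROM big_table" mydb > big_table.tsv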

answered by Dave Dopson

For InnoDB, --order-by-primary --extended-insert is usually the best combo. If you're after every last bit of performance and the target box has many CPU cores, you might want to split the resulting dump file and do parallel inserts in many threads, up to innodb_thread_concurrency/2.
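Put together as a single command (credentials are placeholders; --single-transaction is added here on the assumption that the source is live, as discussed in the answer above):

    # --order-by-primary dumps each table's rows in primary-key order, so the
    # target inserts them in clustered-index order rather than at random.
    mysqldump --order-by-primary --extended-insert --single-transaction \
        --user=root --password=secret mydb > mydb.sql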

Also, tweak innodb_buffer_pool_size on the target to the maximum you can afford, and increase innodb_log_file_size to 128 or 256 MB (careful with this: you need to remove the old log files before restarting the mysql daemon, otherwise it won't restart).
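An illustrative my.cnf fragment for the target server, with placeholder sizes; MySQL 5.1 refuses to start if the existing ib_logfile* files don't match the configured innodb_log_file_size, hence the warning above about removing the old log files first:

    # Settings on the restore target - sizes are examples, not recommendations.
    [mysqld]
    innodb_buffer_pool_size = 4G
    innodb_log_file_size    = 256M

    # After a clean shutdown, move the old InnoDB log files aside before
    # restarting, e.g.: mv /var/lib/mysql/ib_logfile* /tmp/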

answered by ggiroux