I've been working with importing large CSV files of data; usually less than 100,000 records. I'm working with PHP and MySQL (InnoDB tables). I needed to use PHP to transform some fields and do some text processing prior to the MySQL INSERT
s (part of process_note_data()
in code below). MySQL's LOAD DATA
was not feasible, so please do not suggest it.
I recently tried to improve the speed of this process by using MySQL transactions using START TRANSACTION
and COMMIT
. The performance increase was surprising. Processing time(s) dropped by a factor of 20. So, a 20 minute process only took about 1 minute.
QUESTIONS.
1.) Does anyone understand why there was such performance increase (20 mins to 1 min)?
2.) Should I be concerned about how big the transaction may get with 100,000 records?
3.) Should I be concerned with a large number of inserts and/or updates in the transaction?
/* * Customer Notes Data: * Rows are either a meeting, call or note! */ $row = 1; $data = array(); $fields = array(); $line = ''; $db->query('SET autocommit=0;'); $db->query('START TRANSACTION;'); if (($handle = fopen("modules/".$currentModule."/Data/customernote.csv", "r")) !== FALSE) { while (($data = fgetcsv($handle, 4096, ',', '"')) !== FALSE && $row < 999000) { //Row 1 - CSV header row with field names if ($row == 1) { $csv_fields = $data; } elseif ($row > 1) { $fields = $this->process_note_data($data, $csv_fields, $row); } $row++; } // end while fclose($handle); } $db->query('COMMIT;'); $db->query('SET autocommit=1;');
Note: The text/field processing is done in the call to $this->process_note_data()
which then calls another helper class that has the INSERT
statement code. I didn't have enough room to include all of the code. $db->query()
is a typical database object for MySQL queries.
MySQL transaction allows you to execute a set of MySQL operations to ensure that the database never contains the result of partial operations. In a set of operations, if one of them fails, the rollback occurs to restore the database to its original state.
A COMMIT means that the changes made in the current transaction are made permanent and become visible to other sessions. A ROLLBACK statement, on the other hand, cancels all modifications made by the current transaction. Both COMMIT and ROLLBACK release all InnoDB locks that were set during the current transaction.
In MySQL, the transactions begin with the statement BEGIN WORK and end with either a COMMIT or a ROLLBACK statement. The SQL commands between the beginning and ending statements form the bulk of the transaction.
If autocommit mode is enabled, each SQL statement forms a single transaction on its own. By default, MySQL starts the session for each new connection with autocommit enabled, so MySQL does a commit after each SQL statement if that statement did not return an error.
My own little test in .Net (4 fields pr. records):
INSERT 1 record, no transaction:60 ms
INSERT 1 record, using transaction:158 ms
INSERT 200 records using transactions, commit after each record:17778 ms
INSERT 200 records using no transactions:4940 ms
INSERT 200 records using transactions, only commit after last record:4552 ms
INSERT 1000 records using transactions, only commit after last record:21795 ms
Client in Denmark, server in Belgium (Google cloud f1-micro).
I meant to put this in a comment but the formatting is not good....so here is my apology in advance ;-)
Please check this link:
https://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-transaction-management.html
InnoDB must flush the log to disk at each transaction commit if that transaction made modifications to the database. When each change is followed by a commit (as with the default autocommit setting), the I/O throughput of the storage device puts a cap on the number of potential operations per second.
Big transactions may affect performance during commit (check above)
Only in case of rollback, however it may be optimized using some settings (check the link)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With