I'm using Laravel 5.7 to fetch large amounts of data (around 500k rows) from an API server and insert it into a table (call it Table A) quite frequently (at least every six hours, 24/7) - however, it's enough to insert only the changes the next time we insert (but at least 60-70% of the items will change). So this table will quickly have tens of millions of rows.
I came up with the idea to make a helper table (call it Table B) to store all the new data into it. Before inserting everything into Table A, I want to compare it to the previous data (with Laravel, PHP) from Table B - so I will only insert the records that need to be updated. Again it will usually be around 60-70% of the records.
My first question is if this above-mentioned way is the preferred way of doing it, in this situation (obviously I want to make it happen as fast as possible.) I assume that searching for an updating the records in the table would take a lot more time and it would keep the table busy / lock it. Is there a better way to achieve the same (meaning to update the records in the DB).
The second issue I'm facing is the slow insert times. Right now I'm using a local environment (16GB RAM, I7-6920HQ CPU) and MySQL is inserting the rows very slowly (about 30-40 records at a time). The size of one row is around 50 bytes.
I know it can be made a lot faster by fiddling around with InnoDB's settings. However, I'd also like to think that I can do something on Laravel's side to improve performance.
Right now my Laravel code looks like this (only inserting 1 record at a time):
foreach ($response as $key => $value)
{
DB::table('table_a')
->insert(
[
'test1' => $value['test1'],
'test2' => $value['test2'],
'test3' => $value['test3'],
'test4' => $value['test4'],
'test5' => $value['test5'],
]);
}
$response
is a type of array.
So my second question: is there any way to increase the inserting time of the records to something like 50k/second - both on the Laravel application layer (by doing batch inserts) and MySQL InnoDB level (changing the config).
Current InnoDB settings:
innodb_buffer_pool_size = 256M
innodb_log_file_size = 256M
innodb_thread_concurrency = 16
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = normal
innodb_use_native_aio = true
MySQL version is 5.7.21.
If I forgot to tell/add anything, please let me know in a comment and I will do it quickly.
Edit 1: The server that I'm planning to use will have SSD on it - if that makes any difference. I assume MySQL inserts will still count as I/O.
autocommit
and manually commit at end of insertionAccording to MySQL 8.0 docs. (8.5.5 Bulk Data Loading for InnoDB Tables)
You can increase the INSERT speed by turning off auto commit:
- When importing data into InnoDB, turn off autocommit mode, because it performs a log flush to disk for every insert. To disable autocommit during your import operation, surround it with SET autocommit and COMMIT statements:
SET autocommit=0; ... SQL import statements ... COMMIT;
Other way to do it in Laravel is using Database Transactions:
DB::beginTransaction()
// Your inserts here
DB::commit()
INSERT
with multiple VALUES
Also according to MySQL 8.0 docs (8.2.5.1 Optimizing INSERT Statements) you can optimize INSERT speed by using multiple VALUES
on a single insert statement.
To do it with Laravel, you can just pass an array of values to the insert()
method:
DB::table('your_table')->insert([
[
'column_a'=>'value',
'column_b'=>'value',
],
[
'column_a'=>'value',
'column_b'=>'value',
],
[
'column_a'=>'value',
'column_b'=>'value',
],
]);
According to the docs, it can be many times faster.
Both MySQL docs links that I put on this post have tons of tips on increasing INSERT speed.
If your data source is (or can be) a CSV file, you can run it a lot faster using mysqlimport
to import the data.
Using PHP and Laravel to import data from a CSV file is an overhead, unless you need to do some data processing before inserting.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With