I have a txt file with about 100 million records (numbers). I am reading this file in Python and inserting the records into a MySQL database using simple INSERT statements. But it's taking very long and it looks like the script will never finish. What would be the optimal way to carry out this process? The script is using less than 1% of memory and 10 to 15% of CPU.
Any suggestions on how to handle such large data and insert it efficiently into the database would be greatly appreciated.
Thanks.
The fastest way to insert rows into a table is with the LOAD DATA INFILE statement.
Reference: https://dev.mysql.com/doc/refman/5.6/en/load-data.html
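Here is a minimal sketch of running LOAD DATA from your Python script, assuming the mysql-connector-python driver, a database named mydb, and a table numbers(n BIGINT) with one value per line in the file; the connection details, file path, and column list are placeholders. Note that LOCAL INFILE must be allowed on both the server (local_infile=1) and the client for this to work.

import mysql.connector

# Assumed connection details and schema -- adjust to your environment.
conn = mysql.connector.connect(
    host="localhost", user="user", password="secret",
    database="mydb", allow_local_infile=True,
)
cur = conn.cursor()

# Load the whole file server-side in one statement instead of
# issuing 100 million individual INSERTs from Python.
cur.execute(
    "LOAD DATA LOCAL INFILE '/path/to/records.txt' "
    "INTO TABLE numbers "
    "LINES TERMINATED BY '\\n' "
    "(n)"
)
conn.commit()
cur.close()
conn.close()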
Doing individual INSERT statements to insert one row at a time, RBAR (row by agonizing row), is tediously slow because of all the work the database has to go through to execute each statement: parsing for syntax and semantics, preparing an execution plan, obtaining and releasing locks, writing to the binary log, and so on.
If you have to use INSERT statements, you can make use of MySQL's multi-row INSERT syntax, which will be faster:
INSERT INTO mytable (fee, fi, fo, fum) VALUES
(1,2,3,'shoe')
,(4,5,6,'sock')
,(7,8,9,'boot')
Batching rows like this dramatically reduces the statement count; inserting four rows per statement, for example, is a 75% reduction in the number of statements that need to be executed.
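If you stay with INSERT from Python, a sketch along these lines batches the rows for you; it again assumes mysql-connector-python and the hypothetical numbers(n BIGINT) table, and the batch size is arbitrary. With the MySQL connectors, executemany on a simple INSERT ... VALUES statement is folded into multi-row INSERTs, so you get the same effect as writing the VALUES list by hand.

import mysql.connector

BATCH_SIZE = 10000  # rows per multi-row INSERT; tune to taste

conn = mysql.connector.connect(
    host="localhost", user="user", password="secret", database="mydb"
)
cur = conn.cursor()

batch = []
with open("records.txt") as f:
    for line in f:
        batch.append((int(line),))
        if len(batch) >= BATCH_SIZE:
            # executemany rewrites this into a multi-row INSERT
            cur.executemany("INSERT INTO numbers (n) VALUES (%s)", batch)
            conn.commit()
            batch.clear()

if batch:  # flush any leftover rows
    cur.executemany("INSERT INTO numbers (n) VALUES (%s)", batch)
    conn.commit()

cur.close()
conn.close()

Committing once per batch rather than once per row also avoids paying the transaction overhead 100 million times.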