 

Inserting millions of records into MySQL database using Python

I have a txt file with about 100 million records (numbers). I am reading this file in Python and inserting the records into a MySQL database using simple INSERT statements. It is taking very long, and it looks like the script will never finish. What would be the optimal way to carry out this process? The script is using less than 1% of memory and 10 to 15% of CPU.

Any suggestions on how to handle such a large volume of data and insert it efficiently into the database would be greatly appreciated.

Thanks.

asked Jan 08 '23 by jcoder12

1 Answer

The fastest way to insert rows into a table is with the LOAD DATA INFILE statement.

Reference: https://dev.mysql.com/doc/refman/5.6/en/load-data.html
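
For example, you could issue the statement from Python through MySQL Connector/Python. This is a minimal sketch: the file path, table name, and column name are placeholders, and LOAD DATA LOCAL INFILE requires the local_infile capability to be enabled on both the client connection and the server.

    import mysql.connector

    # allow_local_infile=True is required for LOAD DATA LOCAL INFILE;
    # the server must also have local_infile enabled.
    conn = mysql.connector.connect(user='user', password='secret',
                                   database='test', allow_local_infile=True)
    cursor = conn.cursor()

    # Assumes one number per line in the file; names are placeholders.
    cursor.execute("""
        LOAD DATA LOCAL INFILE '/path/to/numbers.txt'
        INTO TABLE mytable
        LINES TERMINATED BY '\\n'
        (num)
    """)
    conn.commit()
    conn.close()

Because the server bulk-loads the whole file in a single statement, this sidesteps the per-statement overhead entirely.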

Doing individual INSERT statements to insert one row at a time, RBAR (row by agonizing row), is tediously slow because of all the work the database has to go through to execute each statement: parsing the syntax, checking semantics, preparing an execution plan, obtaining and releasing locks, writing to the binary log, and so on.

If you have to do INSERT statements, you can make use of MySQL's multi-row INSERT syntax, which is faster:

  INSERT INTO mytable (fee, fi, fo, fum) VALUES
   (1,2,3,'shoe')
  ,(4,5,6,'sock')
  ,(7,8,9,'boot');

If you insert four rows at a time, that's a 75% reduction in the number of statements that need to be executed.
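
If you are driving this from Python, MySQL Connector/Python's executemany() gives you multi-row inserts without building the SQL by hand: for a plain INSERT ... VALUES statement it rewrites the batch into a single multi-row INSERT. A minimal sketch, assuming a table mytable with one numeric column num (table, column, and credentials are placeholders):

    import mysql.connector

    BATCH_SIZE = 10000  # rows per multi-row INSERT statement

    conn = mysql.connector.connect(user='user', password='secret',
                                   database='test')
    cursor = conn.cursor()

    batch = []
    with open('numbers.txt') as f:
        for line in f:
            batch.append((int(line),))
            if len(batch) >= BATCH_SIZE:
                # Connector/Python rewrites this into one multi-row INSERT
                cursor.executemany(
                    "INSERT INTO mytable (num) VALUES (%s)", batch)
                batch = []

    if batch:  # flush the final partial batch
        cursor.executemany("INSERT INTO mytable (num) VALUES (%s)", batch)

    conn.commit()
    cursor.close()
    conn.close()

Reading the file line by line keeps memory usage flat, and a single commit at the end (Connector/Python leaves autocommit off by default) avoids per-row transaction overhead.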

answered Jan 10 '23 by spencer7593