I have written a program in C to parse large XML files and then create files containing INSERT statements. A separate process ingests those files into a MySQL database. This data will serve as an indexing service so that users can find documents easily.
I have chosen InnoDB for its row-level locking. The C program will generate anywhere from 500 to 5 million INSERT statements on a given invocation.
What is the best way to get all this data into the database as quickly as possible? One other thing to note: the database is on a separate server. Is it worth moving the files over to that server to speed up the inserts?
EDIT: This table won't really be updated, but rows will be deleted.
LOAD DATA (in all of its forms) is more efficient than INSERT because it loads rows in bulk rather than one statement at a time.
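Since your C program already writes out files, a single LOAD DATA statement can pull in a whole file in one round trip. Here is a minimal sketch, assuming a hypothetical table my_table(col1, col2) and a tab-separated file at /tmp/my_table.tsv (both are assumptions, not from the question). The LOCAL keyword makes the client send the file to the remote server itself, so you would not need to copy the files over manually, though it does require the local_infile option to be enabled on both client and server:

```sql
-- Sketch: bulk-load one generated file into a hypothetical table.
-- Table name, column names, and file path are placeholders.
LOAD DATA LOCAL INFILE '/tmp/my_table.tsv'
INTO TABLE my_table
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
(col1, col2);
```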
To improve insert performance you should use batch (multi-row) inserts: INSERT INTO my_table (col1, col2) VALUES (val1_1, val2_1), (val1_2, val2_2); Storing the records in a file and using LOAD DATA INFILE yields even better results (the best in my case), but it requires more effort.
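If you do stick with INSERT statements, committing many batched inserts inside one transaction avoids a log flush per statement on InnoDB, which matters at 5 million rows. A sketch using the same hypothetical my_table; the unique_checks and foreign_key_checks toggles are standard MySQL bulk-load settings, not something from the question, and should only be disabled if you trust the generated data:

```sql
-- Sketch: one commit for a large run of batched inserts (InnoDB).
SET autocommit = 0;
SET unique_checks = 0;       -- optional: defer secondary unique-index checks
SET foreign_key_checks = 0;  -- optional: skip FK checks during the bulk load
INSERT INTO my_table (col1, col2) VALUES (1, 'a'), (2, 'b');
INSERT INTO my_table (col1, col2) VALUES (3, 'c'), (4, 'd');
-- ... many more batched INSERTs generated by the C program ...
COMMIT;
SET unique_checks = 1;       -- restore normal checking afterwards
SET foreign_key_checks = 1;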