Upload a large CSV file (approx. 10,000,000 records) into a MySQL table when the file also contains duplicate rows

I want to upload a large CSV file of approximately 10,000,000 records into a MySQL table that already contains the same number of records or more, and the file also includes some duplicate records. I tried LOAD DATA LOCAL INFILE, but it is taking too long. How can I resolve this without waiting for a long time? If it can't be resolved, how can I use AJAX to send a batch of records at a time and process them, repeating until the whole CSV has been uploaded/processed?
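For reference, a minimal sketch of the kind of statement involved, assuming a hypothetical table phone_records with a unique key on the columns that define a duplicate, and a hypothetical file /tmp/records.csv; the IGNORE keyword makes MySQL skip rows that would collide with that key, which is one way to keep duplicates out during the load:

    -- Hypothetical table/file names; adjust the column list and delimiters to the real CSV.
    -- IGNORE skips rows that violate a unique key; REPLACE would overwrite the existing row instead.
    LOAD DATA LOCAL INFILE '/tmp/records.csv'
    IGNORE
    INTO TABLE phone_records
    FIELDS TERMINATED BY ','
    OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES
    (phone_number, name, city);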

akashdeep asked Sep 28 '11 at 20:09

1 Answer

LOAD DATA INFILE isn't going to be beaten, speed-wise. There are a few things you can do to speed it up:

  • Drop or disable some indexes (you'll have to wait for them to be rebuilt after the load, but this is often faster overall). If you're using MyISAM, you can ALTER TABLE *foo* DISABLE KEYS; unfortunately, InnoDB doesn't support that, so you'll have to drop the indexes instead (see the sketch after this list).
  • Optimize your my.cnf settings. In particular, you may be able to disable a lot of safety features (like fsync). Of course, if the server crashes, you'll have to restore a backup and start the load over again. Also, if you're running the default my.cnf, last I checked it's pretty sub-optimal for a database machine; plenty of tuning guides are around.
  • Buy faster hardware. Or rent some (e.g., try a fast Amazon EC2 instance).
  • As @ZendDevel mentions, consider other data storage solutions, if you're not locked into MySQL. For example, if you're just storing a list of telephone numbers (and some data with them), a plain hash table is going to be many times faster.
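As a rough illustration of the first two points, here is roughly what the MyISAM key-disabling and a few commonly relaxed safety settings look like (the table name foo is the placeholder from above; the exact settings and values are assumptions, not a prescribed configuration):

    -- MyISAM only: postpone non-unique index maintenance until after the load.
    ALTER TABLE foo DISABLE KEYS;

    -- Checks that are commonly relaxed for the duration of a bulk load.
    SET SESSION unique_checks = 0;
    SET SESSION foreign_key_checks = 0;

    -- InnoDB: trade crash safety for speed (needs privileges; revert afterwards).
    SET GLOBAL innodb_flush_log_at_trx_commit = 0;

    -- ... run LOAD DATA INFILE here ...

    -- Restore the settings and rebuild the indexes.
    SET GLOBAL innodb_flush_log_at_trx_commit = 1;
    SET SESSION foreign_key_checks = 1;
    SET SESSION unique_checks = 1;
    ALTER TABLE foo ENABLE KEYS;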

If the problem is that it's killing your database's performance, you can split your CSV file into multiple smaller CSV files and load them in chunks.
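A sketch of what that chunked load might look like once the file has been split (the chunk file names and table are hypothetical); each statement is its own smaller load, so other queries get a chance to run in between:

    -- Load each pre-split piece separately instead of one 10M-row file.
    LOAD DATA LOCAL INFILE '/tmp/records_part_01.csv' IGNORE INTO TABLE phone_records
        FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
    LOAD DATA LOCAL INFILE '/tmp/records_part_02.csv' IGNORE INTO TABLE phone_records
        FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
    -- ... and so on for the remaining chunks.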

derobert answered Oct 13 '22 at 02:10