I have a 1,000,000 row .csv file that I uploaded into a table using MySQL Workbench, but I forgot to make the dates YYYY-MM-DD before I started, so they all uploaded as 0000-00-00.
It took almost 8 hours to upload the million records, so I'd REALLY like not to have to do it all over again, but I can't figure out whether there's a way to replace JUST that one column of data from the same file I originally uploaded, now that I've changed the dates in it to the correct format.
Does anyone know if this is possible?
Edit
It's WAY too long to post everything, but here's the SHOW CREATE TABLE output with some of the meat taken out:
CREATE TABLE `myTable` (
  `lineID` int(11) NOT NULL AUTO_INCREMENT,
  `1` varchar(50) DEFAULT NULL,
  `2` varchar(1) DEFAULT NULL,
  `3` int(4) DEFAULT NULL,
  `4` varchar(20) DEFAULT NULL,
  `DATE` date DEFAULT NULL,
  PRIMARY KEY (`lineID`)
) ENGINE=InnoDB AUTO_INCREMENT=634205 DEFAULT CHARSET=utf8
Version is 5.6.20
OK. I would recommend using LOAD DATA INFILE explicitly. For those who have not used it, think of it for now as a single statement that streams a whole file straight into a table.
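First, though, here is how that speaks to your actual question. This is a minimal sketch, assuming the corrected .csv has the same rows in the same order as the original load, that lineID in myTable runs 1..N in that same order, and with the staging table name, file path, and CSV delimiters as placeholder assumptions: bulk-load the fixed file into a throwaway staging table, then UPDATE-join only the date column across.

CREATE TABLE `myTable_stage` (
  `lineID` int(11) NOT NULL AUTO_INCREMENT,
  `1` varchar(50) DEFAULT NULL,
  `2` varchar(1) DEFAULT NULL,
  `3` int(4) DEFAULT NULL,
  `4` varchar(20) DEFAULT NULL,
  `DATE` date DEFAULT NULL,
  PRIMARY KEY (`lineID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Bulk-load the corrected file; AUTO_INCREMENT assigns lineID in file order.
-- (If secure_file_priv blocks the path, LOAD DATA LOCAL INFILE is the usual workaround.)
LOAD DATA INFILE '/path/to/corrected.csv'
INTO TABLE myTable_stage
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES                 -- drop this if your file has no header row
(`1`, `2`, `3`, `4`, `DATE`);

-- Overwrite only the bad column, matching rows on the auto-increment key.
UPDATE myTable t
JOIN myTable_stage s ON s.lineID = t.lineID
SET t.`DATE` = s.`DATE`;

DROP TABLE myTable_stage;

If the row-for-row lineID assumption does not hold, join on whatever combination of the other columns uniquely identifies a row instead.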
Here is a nice article on performance and strategies titled Testing the Fastest Way to Import a Table into MySQL. Don't let the MySQL version in the title or inside the article scare you away. Jumping to the bottom and picking up some of its conclusions:
The fastest way you can import a table into MySQL without using raw files is the LOAD DATA syntax. Use parallelization for InnoDB for better results, and remember to tune basic parameters like your transaction log size and buffer pool. Careful programming and importing can make a >2-hour problem become a 2-minute process. You can temporarily disable some security features for extra performance.
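To make "tune basic parameters" and "disable some security features" concrete, this is the kind of session-level preamble commonly wrapped around a bulk load. The variables are standard MySQL, but treat the recipe as a sketch and restore everything afterwards:

SET unique_checks = 0;        -- skip unique-key checks during the load
SET foreign_key_checks = 0;   -- skip FK validation during the load
SET autocommit = 0;           -- one big transaction instead of one per statement
-- SET sql_log_bin = 0;       -- optionally skip binary logging (needs SUPER)

-- ... run the LOAD DATA INFILE here ...
COMMIT;

SET unique_checks = 1;
SET foreign_key_checks = 1;
SET autocommit = 1;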
There are also finer points in the article, mainly in the peer comments going back and forth, about secondary indexes (which you do not have). The important point for others is to add them after the fact, as sketched below.
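For example, if a secondary index on the date column were wanted later, adding it once after the load is a single sorted build instead of a million incremental updates (the index name is illustrative):

ALTER TABLE myTable ADD INDEX idx_date (`DATE`);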
I hope these links are useful. And may your data come in, via LOAD DATA INFILE into another test table, in 10 minutes.
About the slowest way to do it is from a programming language in a while loop, one row at a time. Considerably faster is batching, where a single INSERT statement passes along, say, 200 to 1,000 rows at a time, as shown below. Up substantially from that is LOAD DATA INFILE. Fastest is raw files (which is what I do, but that is beyond the scope of this discussion).
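To make the batching shape concrete (column names are from your table; the values are dummies):

-- Row by row: one statement and one round trip per row (slowest).
INSERT INTO myTable (`1`, `2`, `3`, `4`, `DATE`)
VALUES ('a', 'x', 1, 'foo', '2014-01-01');

-- Batched: one statement carries many rows; 200 to 1,000 per statement is typical.
INSERT INTO myTable (`1`, `2`, `3`, `4`, `DATE`)
VALUES
  ('a', 'x', 1, 'foo', '2014-01-01'),
  ('b', 'y', 2, 'bar', '2014-01-02'),
  ('c', 'z', 3, 'baz', '2014-01-03');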