
Pandas sql update efficiently

I am using Python pandas to load data from a MySQL database, change it, and then update another table. There are 100,000+ rows, so the UPDATE queries take some time.

Is there a more efficient way to update the data in the database than iterating with df.iterrows() and running an UPDATE query for each row?

Mantis asked Mar 18 '26 09:03



1 Answer

The problem here is not pandas but the UPDATE operations. Each row fires its own UPDATE query, which adds up to a lot of overhead for the database connector to handle.

You are better off dumping your dataframe to CSV with df.to_csv('filename.csv') and then reading that CSV file into your MySQL database with the LOAD DATA INFILE statement.

Load it into a new table, then DROP the old table and RENAME the new one to the old one's name.
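A rough sketch of the dump, load and swap steps above. The helper names (dump_frame, swap_statements, run_statements) and the table names are made up for illustration, and it assumes the dataframe columns are in the same order as the table columns, that the table names are trusted strings, and that the CSV path is readable by the MySQL server (plain LOAD DATA INFILE reads the file on the server host; LOAD DATA LOCAL INFILE is the client-side variant).

```python
import pandas as pd


def dump_frame(df, csv_path):
    """Write df as a headerless, comma-separated file for LOAD DATA INFILE."""
    # No index and no header row: LOAD DATA maps fields to columns by position.
    # lineterminator needs pandas >= 1.5 (older versions call it line_terminator).
    df.to_csv(csv_path, index=False, header=False, lineterminator="\n")


def swap_statements(csv_path, old_table, new_table):
    """Build the SQL that loads the CSV into a new table and swaps it in."""
    return [
        # Empty copy of the old table's structure.
        f"CREATE TABLE `{new_table}` LIKE `{old_table}`",
        # Bulk load; the field and line options match what dump_frame writes.
        f"LOAD DATA INFILE '{csv_path}' INTO TABLE `{new_table}` "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n'",
        # Drop the old table, then give the new one its name.
        f"DROP TABLE `{old_table}`",
        f"RENAME TABLE `{new_table}` TO `{old_table}`",
    ]


def run_statements(conn, statements):
    """Run the statements on any DB-API connection (e.g. from a MySQL driver)."""
    cur = conn.cursor()
    for sql in statements:
        cur.execute(sql)
    conn.commit()
```

With a connection from whichever MySQL driver you use, it would be called roughly like this: dump_frame(df, '/path/to/items.csv') followed by run_statements(conn, swap_statements('/path/to/items.csv', 'items', 'items_new')). Missing values are written as empty fields here, so NULL handling would need extra care.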

Furthermore, I suggest doing the same when loading data into pandas: use MySQL's SELECT ... INTO OUTFILE statement and then load that file into pandas with pd.read_csv().
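The read direction could look something like the sketch below. The helper names (outfile_statement, read_outfile) are again hypothetical, and it assumes trusted table and column names and an output path the MySQL server is allowed to write to. The exported file has no header row, so the column names are passed to read_csv explicitly, and MySQL writes NULL as \N by default, which is mapped to NaN here. Text fields that contain quotes or backslashes are escaped by MySQL and would need extra handling that this sketch leaves out.

```python
import pandas as pd


def outfile_statement(table, columns, out_path):
    """Build a SELECT ... INTO OUTFILE statement that writes comma-separated rows."""
    cols = ", ".join(f"`{c}`" for c in columns)
    return (
        f"SELECT {cols} FROM `{table}` "
        f"INTO OUTFILE '{out_path}' "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n'"
    )


def read_outfile(source, columns):
    """Load the exported file (a path or file-like object) into a DataFrame."""
    # header=None because INTO OUTFILE writes data rows only;
    # r"\N" is how MySQL represents NULL in the export by default.
    return pd.read_csv(source, header=None, names=columns, na_values=[r"\N"])
```

After executing the statement on the server, read_outfile('/path/to/items.csv', ['id', 'name']) would give a dataframe with those two columns.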

firelynx answered Mar 19 '26 22:03



