
In MySQL, is it faster to delete and then insert or is it faster to update existing rows?

First of all, let me just say that I'm using the PHP framework Yii, so I'd like to stay within its defined set of SQL statements if possible. I know I could probably write one huge SQL statement that would do everything, but I'd rather not go there.

OK, imagine I have a table Users and a table FavColors. Then I have a form where users can select their color preferences by checking one or more checkboxes from a large list of possible colors.

Those results are stored as multiple rows in the FavColors table like this (id, user_id, color_id).

Now imagine the user goes in and changes their color preference. In this scenario, what would be the most efficient way to get the new color preferences into the database?

Option 1:

  • Do a mass delete of all rows where user_id matches
  • Then do a mass insert of all new rows
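A minimal sketch of Option 1, written in Python against the standard-library sqlite3 module purely so that it runs without a MySQL server or Yii. The fav_colors table mirrors the (id, user_id, color_id) layout described above; the UNIQUE constraint and all function names are assumptions made for illustration. Running both statements in one transaction keeps the user's rows from vanishing if the insert fails.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a FavColors-style table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fav_colors ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " user_id INTEGER NOT NULL,"
        " color_id INTEGER NOT NULL,"
        " UNIQUE (user_id, color_id))"  # assumption: one row per user/color pair
    )
    return conn


def replace_all_colors(conn, user_id, color_ids):
    """Option 1: delete every row for the user, then insert the new set."""
    with conn:  # commits on success, rolls back if either statement raises
        conn.execute("DELETE FROM fav_colors WHERE user_id = ?", (user_id,))
        conn.executemany(
            "INSERT INTO fav_colors (user_id, color_id) VALUES (?, ?)",
            [(user_id, color_id) for color_id in color_ids],
        )


def colors_for(conn, user_id):
    """Return the user's color ids in ascending order."""
    rows = conn.execute(
        "SELECT color_id FROM fav_colors WHERE user_id = ? ORDER BY color_id",
        (user_id,),
    )
    return [row[0] for row in rows]
```

Calling replace_all_colors(conn, 1, [5, 7, 9]) after an earlier save leaves exactly those three colors, but every row comes back with a fresh id, which is the auto-increment concern raised in this question.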

Option 2:

  • Go through each current row to see what's changed, and update accordingly
  • If more rows need to be inserted, do that.
  • If rows need to be deleted, do that.
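A rough sketch of Option 2, again in Python with the standard-library sqlite3 module as a stand-in for MySQL and Yii. The schema and the names create_schema and sync_colors are hypothetical. Since a row here carries nothing besides user_id and color_id, an unchanged row has nothing to UPDATE, so the comparison reduces to two set differences: colors to delete and colors to insert.

```python
import sqlite3


def create_schema(conn):
    """Create a FavColors-style table (hypothetical schema for illustration)."""
    conn.execute(
        "CREATE TABLE fav_colors ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " user_id INTEGER NOT NULL,"
        " color_id INTEGER NOT NULL,"
        " UNIQUE (user_id, color_id))"
    )


def sync_colors(conn, user_id, new_color_ids):
    """Option 2: delete and insert only the rows that actually changed."""
    current = {
        row[0]
        for row in conn.execute(
            "SELECT color_id FROM fav_colors WHERE user_id = ?", (user_id,)
        )
    }
    wanted = set(new_color_ids)
    to_delete = current - wanted  # stored but no longer selected
    to_insert = wanted - current  # selected but not yet stored
    with conn:  # one transaction for both batches
        conn.executemany(
            "DELETE FROM fav_colors WHERE user_id = ? AND color_id = ?",
            [(user_id, color_id) for color_id in sorted(to_delete)],
        )
        conn.executemany(
            "INSERT INTO fav_colors (user_id, color_id) VALUES (?, ?)",
            [(user_id, color_id) for color_id in sorted(to_insert)],
        )
    return to_delete, to_insert
```

Rows for colors that stay selected are never touched, so they keep their existing ids. The cost is one extra SELECT and a little set arithmetic in application code.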

I like Option 1 because it only requires two statements, but something feels wrong about deleting rows only to put almost exactly the same data back in. There's also the issue of the auto-increment ids climbing to higher values more quickly, and I don't know whether that should be avoided whenever possible.

Option 2 will require a lot more programming work, but would prevent situations where I'd delete a row just to create it again. However, adding more load in PHP may not be worth the decrease in load for MySQL.

Any thoughts? What would you all do?

Philip Walton asked Oct 26 '10 02:10



2 Answers

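UPDATE is generally much faster. When you UPDATE, the existing table records are simply rewritten with new data. With DELETE followed by INSERT, that same write still has to happen on the INSERT, in addition to the work the DELETE has already done.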

When you DELETE, the indexes have to be updated (remember, you delete the whole row, not only the columns you need to modify), and data blocks may be moved (if you hit the PCTFREE limit, which is an Oracle storage setting; MySQL storage engines manage free space in their own way). Deleting and re-inserting also gives the records new auto_increment IDs, so any relationships that reference those records would break or would need updating too. I'd go for UPDATE.

That's why you should prefer INSERT ... ON DUPLICATE KEY UPDATE over REPLACE.

The former performs an UPDATE when there is a key violation, while the latter performs a DELETE followed by an INSERT.

UPDATE: Here's an example:

    INSERT INTO t1 (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=c+1;

For more details, read the update documentation.
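The difference between the two statements can be observed with a small experiment. The sketch below uses Python's standard-library sqlite3 module only because it runs without a server, so it is an analogy rather than MySQL itself: SQLite spells the upsert as ON CONFLICT ... DO UPDATE (available from SQLite 3.24) instead of ON DUPLICATE KEY UPDATE, while its REPLACE likewise removes the conflicting row and inserts a new one. The table t and the function name compare_upsert_and_replace are hypothetical.

```python
import sqlite3


def compare_upsert_and_replace():
    """Return the row id before, after an upsert, and after a REPLACE."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " a INTEGER UNIQUE,"
        " c INTEGER)"
    )
    conn.execute("INSERT INTO t (a, c) VALUES (1, 3)")
    original_id = conn.execute("SELECT id FROM t WHERE a = 1").fetchone()[0]

    # Upsert: the key a = 1 already exists, so the existing row is updated in place.
    conn.execute(
        "INSERT INTO t (a, c) VALUES (1, 3) "
        "ON CONFLICT (a) DO UPDATE SET c = c + 1"
    )
    after_upsert = conn.execute("SELECT id, c FROM t WHERE a = 1").fetchone()

    # REPLACE: the conflicting row is deleted and a brand-new row is inserted.
    conn.execute("REPLACE INTO t (a, c) VALUES (1, 10)")
    after_replace = conn.execute("SELECT id, c FROM t WHERE a = 1").fetchone()

    conn.close()
    return original_id, after_upsert, after_replace
```

In this sketch the upsert keeps the original id, whereas REPLACE leaves the row with a different id. That is the reason rows referenced from other tables are safer with the upsert form.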

Srikar Appalaraju answered Sep 23 '22 00:09



Philip, have you tried using prepared statements? With a prepared statement you can prepare one query and call it multiple times with different parameters. At the end of your loop, you can execute all of them with a minimal amount of network latency. I have used prepared statements with PHP and they work great, although they are a little more confusing than Java prepared statements.
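As a rough illustration of the idea, here is one parameterized statement executed with many parameter sets. It uses Python's standard-library sqlite3 module instead of PHP so that it runs on its own; SQLite is in-process, so it shows only the shape of the approach and not the network savings. The schema and the names create_schema and insert_colors are hypothetical.

```python
import sqlite3


def create_schema(conn):
    """Create a FavColors-style table (hypothetical schema for illustration)."""
    conn.execute(
        "CREATE TABLE fav_colors ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " user_id INTEGER NOT NULL,"
        " color_id INTEGER NOT NULL)"
    )


def insert_colors(conn, user_id, color_ids):
    """Run one parameterized INSERT once per color id, in a single transaction."""
    sql = "INSERT INTO fav_colors (user_id, color_id) VALUES (?, ?)"
    params = [(user_id, color_id) for color_id in color_ids]
    with conn:  # commit once after all rows are inserted
        conn.executemany(sql, params)  # same statement text, different parameters
    return len(params)
```

The statement text never changes between rows; only the bound values do. That also keeps user input out of the SQL string itself.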

Amir Raminfar answered Sep 26 '22 00:09
