Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Deleting millions of rows in MySQL

I recently found and fixed a bug in a site I was working on that resulted in millions of duplicate rows of data in a table that will be quite large even without them (still in the millions). I can easily find these duplicate rows and can run a single delete query to kill them all. The problem is that trying to delete this many rows in one shot locks up the table for a long time, which I would like to avoid if possible. The only ways I can see to get rid of these rows, without taking down the site (by locking up the table) are:

  1. Write a script that will execute thousands of smaller delete queries in a loop. This will theoretically get around the locked table issue because other queries will be able to make it into the queue and run in between the deletes. But it will still spike the load on the database quite a bit and will take a long time to run.
  2. Rename the table and recreate the existing table (it'll now be empty). Then do my cleanup on the renamed table. Rename the new table, name the old one back and merge the new rows into the renamed table. This is way takes considerably more steps, but should get the job done with minimal interruption. The only tricky part here is that the table in question is a reporting table, so once it's renamed out of the way and the empty one put in its place all historic reports go away until I put it back in place. Plus the merging process could be a bit of a pain because of the type of data being stored. Overall this is my likely choice right now.

I was just wondering if anyone else has had this problem before and, if so, how you dealt with it without taking down the site and, hopefully, with minimal if any interruption to the users? If I go with number 2, or a different, similar, approach, I can schedule the stuff to run late at night and do the merge early the next morning and just let the users know ahead of time, so that's not a huge deal. I'm just looking to see if anyone has any ideas for a better, or easier, way to do the cleanup.

like image 988
Steven Surowiec Avatar asked Aug 23 '09 16:08

Steven Surowiec


People also ask

How do I delete a lot of rows in MySQL?

We can use DELETE statement along with a WHERE clause, which identifies those multiple rows, to delete multiple rows from MySQL table.

Can MySQL handle 1 million records?

The MySQL maximum row size limit of 65,535 bytes is demonstrated in the following InnoDB and MyISAM examples. The limit is enforced regardless of storage engine, even though the storage engine may be capable of supporting larger rows.


1 Answers

DELETE FROM `table` WHERE (whatever criteria) ORDER BY `id` LIMIT 1000 

Wash, rinse, repeat until zero rows affected. Maybe in a script that sleeps for a second or three between iterations.

like image 102
chaos Avatar answered Sep 23 '22 14:09

chaos