Deleting many rows without locking them

In PostgreSQL I have a query like the following which will delete 250k rows from a 1m row table:

DELETE FROM table WHERE key = 'needle';

The query takes over an hour to execute, and during that time the affected rows are locked for writing. That is a problem because many update queries have to wait for the big delete query to complete (they will then fail because the rows disappeared from under them, but that is OK). I need a way to split this big query into multiple parts so that they interfere with the update queries as little as possible. For example, if the delete query could be split into chunks of 1000 rows each, the other update queries would at most have to wait for a delete query involving 1000 rows.

DELETE FROM table WHERE key = 'needle' LIMIT 10000;

That query would work nicely, but alas PostgreSQL does not support LIMIT in a DELETE statement.

Björn Lindqvist asked Aug 06 '10 05:08


3 Answers

Try a subselect that picks a limited batch of rows by a unique column (id in this example):

DELETE FROM 
  table 
WHERE 
  id IN (SELECT id FROM table WHERE key = 'needle' LIMIT 10000);
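
To clear all matching rows, the statement above has to be repeated until nothing is left to delete. Below is a minimal sketch of such a loop, not a definitive implementation. It makes several assumptions: Python's built-in sqlite3 module is used only as a stand-in so the example runs without a server, and the table name mytable and the function delete_in_chunks are hypothetical. Against PostgreSQL the same SQL should apply through a driver such as psycopg2, which uses %s placeholders instead of ?.

```python
import sqlite3

def delete_in_chunks(conn, key, chunk_size=10000):
    """Delete rows matching key in batches; return the size of each batch."""
    sizes = []
    while True:
        cur = conn.execute(
            "DELETE FROM mytable WHERE id IN "
            "(SELECT id FROM mytable WHERE key = ? LIMIT ?)",
            (key, chunk_size),
        )
        conn.commit()  # end the transaction so this batch's locks are released
        if cur.rowcount == 0:
            break  # nothing left to delete
        sizes.append(cur.rowcount)
    return sizes

# Demo on an in-memory table (assumption: id is the unique key).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, key TEXT)")
conn.executemany("INSERT INTO mytable (key) VALUES (?)",
                 [("needle",)] * 25 + [("hay",)] * 5)
conn.commit()

batches = delete_in_chunks(conn, "needle", chunk_size=10)
print(batches)  # prints [10, 10, 5]
```

Committing after every batch is the point of the exercise: each transaction holds locks on at most chunk_size rows, so a waiting update is blocked only for the duration of one small delete.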

Frank Heikens answered Oct 18 '22 16:10


Frank's answer is good, but the following can be faster. It requires PostgreSQL 8.4 or later because it relies on window functions (pseudocode):

result = query('select
    id from (
        select id, row_number() over (order by id) as row_number
        from mytable where key=?
    ) as _
    where row_number%8192=0 order by id', 'needle');
// result contains the id of every 8192nd row whose key='needle'
last_id = 0; // assumes all ids are positive
result.append(MAX_INT); // guard
for (row in result) {
    query('delete from mytable
        where id<=? and id>? and key=?', row.id, last_id, 'needle');
    // last_id is used to hint the query planner
    // that there will be no rows with a smaller id,
    // so it is less likely to use a full table scan
    last_id = row.id;
}

Beware: this is premature optimization, which is considered evil.
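
The range-based loop in the pseudocode above can be sketched in runnable form as follows. This is an illustration under stated assumptions rather than the answer's exact method: Python's built-in sqlite3 stands in for PostgreSQL so the example runs without a server, the helper names batch_ranges and delete_by_ranges are hypothetical, ids are assumed to be positive, and the boundary ids are picked client-side by slicing an ordered id list instead of server-side with row_number().

```python
import sqlite3

MAX_INT = 2**63 - 1  # guard value: largest 64-bit signed integer

def batch_ranges(boundary_ids, start=0):
    """Yield (low, high] id ranges from ordered boundary ids, ending at the guard."""
    low = start
    for high in list(boundary_ids) + [MAX_INT]:
        yield low, high
        low = high

def delete_by_ranges(conn, key, chunk_size=8192):
    """Delete rows matching key one id range at a time; return rows deleted per range."""
    # Simplification: boundaries are chosen client-side here, whereas the
    # pseudocode selects them in the database with row_number().
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM mytable WHERE key = ? ORDER BY id", (key,))]
    boundaries = ids[chunk_size - 1::chunk_size]  # every chunk_size-th matching id
    sizes = []
    for low, high in batch_ranges(boundaries):
        cur = conn.execute(
            "DELETE FROM mytable WHERE id > ? AND id <= ? AND key = ?",
            (low, high, key))
        conn.commit()  # one short transaction per range
        sizes.append(cur.rowcount)
    return sizes

# Demo: ids 1..20, even ids hold 'needle' (10 rows), odd ids hold 'hay'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, key TEXT)")
conn.executemany("INSERT INTO mytable (id, key) VALUES (?, ?)",
                 [(i, "needle" if i % 2 == 0 else "hay") for i in range(1, 21)])
conn.commit()

sizes = delete_by_ranges(conn, "needle", chunk_size=4)
print(sizes)  # prints [4, 4, 2]
```

The final range, closed by the guard value, picks up whatever rows remain after the last boundary, which is why the last batch can be smaller than chunk_size.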

Tometzky answered Oct 18 '22 18:10


Set the lock level for your deletes and updates to a more granular lock mode. Note that your transactions will now be slower.

http://www.postgresql.org/docs/current/static/sql-lock.html

http://www.postgresql.org/docs/current/static/explicit-locking.html

potatopeelings answered Oct 18 '22 18:10