 

Postgres 9 super slow simple delete

I have a substantial database... not a very large one - around 1 GB of data in total.

I need to delete some rows from several tables. For example I have a table

 Order
 id     | ... | status   | ...
 1      | ... | 1        | ...
 ...
 40     | ... | 20       | ...
 41     | ... | 1        | ...
 ...
 470000 | ... | 12       | ...

Now I want to delete all orders that have status=1

I presume I do it with:

DELETE FROM Order WHERE status=1

All nice and simple, it would seem, but it takes ages! When I ran this query it was still at 100% CPU usage after 40 minutes; when I killed the process, nothing had been deleted.

When I tried limiting the scope by using

DELETE FROM Order WHERE status=1 AND id<1000

it took a couple of minutes to delete some 200 rows...

Is there anything I am missing in my configuration? Anything I should look for/check/change? Any ideas at all why it's so bloody inefficient?

Let me add that I am usually working with MySQL and need to manage this postgres database but don't really have any experience with postgres, so it may be something very simple.

Indexes are on both id and status columns.

Table has some 500k rows, around half needs to be deleted.

Execution plan:

Delete  (cost=0.00..19474.19 rows=266518 width=6)
  ->  Seq Scan on Orders  (cost=0.00..19474.19 rows=266518 width=6)
        Filter: (statusid = 1)

There are no triggers or rules of any sort. What's more, I should add this is a fresh copy of the table - it was moved from another server with export/import. Perhaps this plays a role somehow?

Will deleting indexes help?

asked Oct 24 '11 by RandomWhiteTrash

People also ask

Why is Postgres slow to delete?

Slow DELETEs are almost always caused by missing indexes on foreign keys that directly or indirectly reference the table being deleted from.
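Such foreign keys can be listed from the system catalogs. The query below is a sketch (the table name `orders` is an assumption); each referencing column it reports should itself be indexed, or every DELETE on `orders` forces a scan of the child table:

```sql
-- List foreign-key constraints on other tables that reference "orders"
-- (table name assumed). The child table's referencing column needs an
-- index, otherwise each deleted parent row triggers a sequential scan.
SELECT conname,
       conrelid::regclass AS child_table
FROM pg_constraint
WHERE contype = 'f'
  AND confrelid = 'orders'::regclass;
```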

How do I quickly delete PostgreSQL?

TRUNCATE provides a faster mechanism to remove all rows from a table. There are two ways to delete rows in a table using information contained in other tables in the database: using sub-selects, or specifying additional tables in the USING clause.
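Both mechanisms look roughly like this (a sketch; the `customers` table and its `inactive` column are hypothetical examples, not from the question):

```sql
-- Fastest way to empty a table entirely: reclaims space at once,
-- but is all-or-nothing and does not fire per-row DELETE triggers.
TRUNCATE orders;

-- Deleting rows based on information in another table, via USING:
DELETE FROM orders o
USING customers c
WHERE o.customer_id = c.id
  AND c.inactive;
```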

Do indexes slow down deletes?

If you update a table, the system has to maintain those indexes that are on the columns being updated. So having a lot of indexes can speed up select statements, but slow down inserts, updates, and deletes.


2 Answers

Having nothing deleted after you killed the process is EXACTLY what you should see.

The delete happens as a transaction, meaning that either everything is deleted or nothing is. To make sure that this can happen, the rows need to be copied somewhere before deleting them, which means that deleting 250k rows takes about as much time as inserting that many rows. In some cases it can be faster to create a new table containing everything NOT deleted and rename the new table to the old name.
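The create-and-swap approach could be sketched like this (assumptions: the table is named `orders` with a `status` column, no other tables reference it, and a brief exclusive lock during the swap is acceptable):

```sql
BEGIN;

-- Keep only the surviving rows in a fresh table.
CREATE TABLE orders_new AS
    SELECT * FROM orders WHERE status <> 1;

DROP TABLE orders;
ALTER TABLE orders_new RENAME TO orders;

-- NOTE: CREATE TABLE ... AS copies only the data. Indexes, constraints,
-- defaults and sequence ownership must be recreated here by hand.

COMMIT;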

If this is being moved from another db, you will probably be better off if you can keep the 250k rows from being inserted in the first place.

(This is general RDBMS wisdom, not postgresql specific - the details of how postgres MVCC works may differ significantly.)

answered Oct 26 '22 by evil otto


It is much quicker to use COPY TO / TRUNCATE / COPY FROM on big tables.

But, of course, you have to be careful with references and, if possible, disable triggers.
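In psql, that round trip could be sketched like this (the file name, table name and `status` filter are assumptions taken from the question; `\copy` runs client-side, so it needs no server filesystem access):

```sql
-- Export only the rows to keep, empty the table, then re-import.
\copy (SELECT * FROM orders WHERE status <> 1) TO 'orders_keep.csv' CSV
TRUNCATE orders;
\copy orders FROM 'orders_keep.csv' CSV
```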

answered Oct 26 '22 by Petr