Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres 8.4.4 (x32 on Win7 x64) very slow UPDATE on small table

I have a very simple update statement:

UPDATE W SET state='thing'
WHERE state NOT IN ('this','that') AND losttime < CURRENT_TIMESTAMP;

The table W only has 90 rows, though the losttime and state columns for each row are updated each about every ~10s seconds. There are indexes on state and losttime (as well as the primary index).

I'm noticing with large databases (i.e. the other tables have a lot of entries, not table W) over a period of time, the query gets slower and slower and slower. After running for 48 hours, I'm timing it by running it in the query window of PqAdminIII and it's taken 17 minutes to execute!

I have a similar query on another table that's showing the same problem:

UPDATE H SET release='1' 
WHERE a NOT IN (SELECT id from A WHERE state!='done') AND release!='1';

H does not have any indexes, but I have tried putting and removing an index on H(release) with no change in behaviour. This query, after the database has been up for 48 hours and the table H has ~100k rows, is taking 27 minutes. Postgres server will have a thread completely pegged (100% CPU utilization) for the duration of the query, so it doesn't look like there's any contention for network, disk, etc.

So in broad strokes, the behaviour I see is that my database runs as expected for about 5 minutes, then gradually everything grinds to a halt as basic maintenance-related UPDATE commands start to take longer and longer to run. By the second day, it's taking an hour to do a simple maintenance cycle (a handful of UPDATES) which were running ~100ms at the outset. It seems clear to me that the performance degrade is super-linear with the amount of information in the database -- maybe N^2 or some such.

Autovacuum is on using the defaults. I read through the manual (again) and didn't see anything that jumped out at me.

I'm scratching my head here. I don't see any bug fixes that seem relevant in 9.0.1 and 9.0.2 release notes. Can anyone help me to understand what is happening? Thanks, M

-x-x-x-x-

Okay, so I may have two problems here.

The first update appears to run fast now. Not sure what happened, so I'll proceed there with the assumption that I need to run VACUUM / ANALYZE or some combination more frequently -- say every minute or so. I would really like to know why autovacuum isn't doing this for me.

The second update continues to run slowly. The query plan suggests that indexes are not being used effectively and that there is a 80k*30k cross occurring, which could account for super-linear runtime that I seem to be observing. (Does everyone agree with this interpretation of the plan?)

I can convert the UPDATE to a SELECT:

SELECT * from H
where a not in (SELECT id from A where state='done') AND release!='1';

with a similar runtime (27 minutes).

If I don't trust the postgres optimizer and do this:

WITH r as (select id from A where state='done')
SELECT a from H 
JOIN on H.a=r.id 
WHERE H.released='0';

then the query runs in ~500ms.

How do I translate this knowledge back into an UPDATE that runs with acceptable speed? My attempt:

UPDATE H SET release='1'
FROM A
where A.state!='done' AND release!='1' AND A.id=H.a;

runs in about 140 seconds, which is faster, but still very very slow.

Where can I go from here?

-x-x-x-x-

VACUUM ANALYZE has been added as part of "routine maintenance" where the application will run it approximately once every minute or so independently of any autovacuum that is running.

Also, rewrote the second query to eliminate the known-to-be-slow NOT IN clause, replacing it with a "Left Anti-Semi Join" (huh?)

UPDATE H SET release='1' 
WHERE release='0' AND NOT EXISTS (SELECT * FROM A WHERE id=H.a AND state!='done');
like image 270
Mayur Patel Avatar asked Dec 24 '10 18:12

Mayur Patel


2 Answers

PostgreSQL implements MVCC.

This means that each time you make an update, a new copy of row is created and the old one is marked as deleted (but is not physically deleted).

This slows down the queries.

You should run VACUUM on a timely basis.

PostgreSQL 8.4.4 runs autovacuum daemon to do this, but it may have some problems on your installation.

Does the situation improve when you run VACUUM manually?

like image 82
Quassnoi Avatar answered Sep 24 '22 02:09

Quassnoi


Check with pg_total_relation_size('tablename') whether your tables are bloated out of proportion. If that is the case, you may need to tweak the autovacuum configuration.

The other alternative is that the tables are locked. Look into pg_stat_activity or pg_locks to find out.

like image 27
Peter Eisentraut Avatar answered Sep 26 '22 02:09

Peter Eisentraut