Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres performance issues

We are running Postgres 9.1.3 and we have recently started to run into major performance problems on one of our servers.

Our queries ran fine for a while, but as of August 1st, they have slowed down dramatically. It would appear that most of the problematic queries are Select queries (queries with count(*) are especially bad), but in general, the database is just running really slow.

We ran this query on the server and these were the changes that we have made to the default config file (Note: The server ran fine with these changes before, so, they likely don't matter much) :

       name            |                                                current_setting
---------------------------+---------------------------------------------------------------------------------------------------------------
version                   | PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by  gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-51), 64-bit
autovacuum                | off
bgwriter_delay            | 20ms
checkpoint_segments       | 6
checkpoint_warning        | 0
client_encoding           | UTF8
default_statistics_target | 1000
effective_cache_size      | 4778MB
effective_io_concurrency  | 2
fsync                     | off
full_page_writes          | off
lc_collate                | en_US.UTF-8
lc_ctype                  | en_US.UTF-8
listen_addresses          | *
maintenance_work_mem      | 1GB
max_connections           | 100
max_stack_depth           | 2MB
port                      | 5432
random_page_cost          | 2
server_encoding           | UTF8
shared_buffers            | 1792MB
synchronous_commit        | off
temp_buffers              | 16MB
TimeZone                  | US/Eastern
wal_buffers               | 16MB
wal_level                 | minimal
wal_writer_delay          | 10ms
work_mem                  | 16MB
(28 rows)

Time: 210.231 ms

Normally, when problems like this arise, the first thing people recommend is vacuuming and we have tried that. We vacuum analyzed most of the database, but it didn't help.

We used Explain on some of our queries and noticed that Postgres was resorting to sequential scans even though the tables had indexes.

We turned sequential scan off to force the query planner into using indexes, but that did not help either.

We then tried out this query to see if we had a lot of unused diskspace that Postgres was going through in order to find what it is looking for. Unfortunately, while some of our tables did have a bit of bulk, it did not seem significant enough to slow down overall system performance.

We think the slowdown might be I/O related, but we can't figure out the specifics. Is Postgres just being silly and if so, what part of it? Is there something wrong with the VM, or perhaps something wrong with the physical hardware itself?

Do you guys have any other suggestions for things that we can try or check out?

EDIT:

I am so sorry for not updating this sooner. I got caught up in other things.

On this particular machine, our performance greatly improved by making one small modification to the Virtual Machine's settings.

There is a setting that deals with IO caching. It was originally set to to ON. We figured that constantly caching things was slowing things down and we were right. We turned it OFF, and things improved drastically.

Interestingly enough most of our other servers already had this setting turned off.

There are other issues, and I am sure we will take a lot of your suggestions, so, thanks a lot for helping.

like image 721
zermy Avatar asked Aug 13 '12 18:08

zermy


1 Answers

Your biggest problem is this line:

autovacuum                | off

Turning it on won't immediately cure the problem, but it should keep things from eroding further. There are almost no cases where it is a good idea to turn this off. The main exception is a big bulk load followed by an explicit VACUUM FREEZE ANALYZE, after which autovacuum should be turned back on. With autovacuum off, you will see performance degrade, just as you have. Once the database has gotten into such bad shape, it requires more aggressive maintenance than autovacuum can provide to recover.

checkpoint_segments       | 6

Increasing this will help data modifications, but won't do much to improve the speed of SELECT statements.

fsync                     | off
full_page_writes          | off

These settings tell PostgreSQL to speed up writes at the expense of persistence. If your hardware or OS (or VM) crashes or is abruptly killed, your database will be corrupted and your best bet will be to restore from your last known good backup. (Of course, since hardware can fail at any time, if you care about losing the data, you have a good backup strategy in place.)

maintenance_work_mem      | 1GB

This is too high for an 8GB VM. You can always boost it on a single connection before running some heavy maintenance on that connection.

wal_writer_delay          | 10ms

Even seasoned experts have trouble adjusting this to something that gets better performance than the default. It is almost always best left alone.

Your best bet at this point is to use pg_dumpall to dump your database cluster to some other medium, start with a fresh initdb, and restore. As a database superuser, run VACUUM FREEZE ANALYZE (the FREEZE is not generally recommended except after a bulk load like that), and run with autovacuum turned on.

I highly recommend that you get a copy of Greg Smith's "PostgreSQL 9.0 High Performance" book, and read it carefully. (Full disclosure, I was one of the technical reviewers for the book, but get no money from sales.) One of the first things he recommends is getting benchmark numbers on the speed of your RAM and disk before you even install PostgreSQL -- that way you know what you're dealing with.

like image 71
kgrittn Avatar answered Sep 24 '22 00:09

kgrittn