Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgresql large table update slows down

I run an update on a large table (e.g. 8 GB). It is a simple update of 3 fields in the table. I had no problems running it under postgresql 9.1, it would take 40-60 minutes but it worked. I run the same query in 9.4 database (freshly created, not upgraded) and it starts the update fine but then slows down. It uses only ~2% CPU, the level if IO is 4-5MB/s and it is sitting there. No locks, no other queries or connections, just this single update SQL on the server.

The SQL is below. "lookup" table has 12 records. The lookup can return only one row, it breaks a discrete scale (SMALLINT, -32768 .. +32767) into non-overlapping regions. "src" and "dest" tables are ~60 million records.

UPDATE dest SET
    field1 = src.field1,
    field2 = src.field2,
    field3_id = (SELECT lookup.id FROM lookup WHERE src.value BETWEEN lookup.min AND lookup.max)
FROM src
WHERE dest.id = src.id;

I thought my disk slowed down but I can copy 1 GB files in parallel to query execution and it runs fast at >40MB/s and I have only one disk (it is a VM with ISCSI media). All other disk operations are not impacted, there is plenty of IO bandwidth. At the same time PostgreSQL is just sitting there doing very little, running very slowly.

I have 2 virtualized linux servers, one runs postgresql 9.1 and another runs 9.4. Both servers have close to identical postgresql configuration.

Has anyone else had similar experience? I am running out of ideas. Help.

Edit The query "ran" for 20 hours I had to kill the connections and restart the server. Surprisingly it didn't kill the connection via query:

SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE pid <> pg_backend_pid() AND datname = current_database();

and sever produced the following log:

2015-05-21 12:41:53.412 EDT FATAL:  terminating connection due to administrator command
2015-05-21 12:41:53.438 EDT FATAL:  terminating connection due to administrator command
2015-05-21 12:41:53.438 EDT STATEMENT:  UPDATE <... this is 60,000,000 record table update statement>

Also server restart took long time, producing the following log:

2015-05-21 12:43:36.730 EDT LOG:  received fast shutdown request
2015-05-21 12:43:36.730 EDT LOG:  aborting any active transactions
2015-05-21 12:43:36.730 EDT FATAL:  terminating connection due to administrator command
2015-05-21 12:43:36.734 EDT FATAL:  terminating connection due to administrator command
2015-05-21 12:43:36.747 EDT LOG:  autovacuum launcher shutting down
2015-05-21 12:44:36.801 EDT LOG:  received immediate shutdown request
2015-05-21 12:44:36.815 EDT WARNING:  terminating connection because of crash of another server process
2015-05-21 12:44:36.815 EDT DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

"The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory" - is this an indication of a bug in PostgreSQL?

Edit I tested 9.1, 9.3 and 9.4. Both 9.1 and 9.3 don't experience the slow down. 9.4 consistently slows down on large transactions. I noticed that when a transaction starts htop monitor indicates high CPU and the process status is "R" (running). Then it gradually changes to low CPU usage and status "D" - disk (see screenshot Disk waiting). My biggest question is why 9.4 is different from 9.1 and 9.3? I have a dozen of servers and this effect is observed across the board.

like image 299
oᴉɹǝɥɔ Avatar asked Oct 21 '25 07:10

oᴉɹǝɥɔ


1 Answers

Thanks everyone for the help. No matter how much I tried to emphasize on the difference of performance between identical configuration of 9.4 and previous versions no one seemed to pay attention to that.

The problem was solved by disabling transparent huge pages:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

Here are some resources I found helpful in reserching the issue:
* https://dba.stackexchange.com/questions/32890/postgresql-pg-stat-activity-shows-commit/34169#34169
* https://lwn.net/Articles/591723/
* https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge

like image 181
oᴉɹǝɥɔ Avatar answered Oct 23 '25 21:10

oᴉɹǝɥɔ



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!