 

Why are my COMMITs after SELECT-only transactions slow with PostgreSQL?

Tags:

postgresql

I have a web app presenting data that's being generated by a separate process and stored in PostgreSQL, version 8.4. The backend is writing fairly continuously, but the majority of views in the web app execute nothing but read-only SELECT queries.

According to the New Relic Python agent, 30% of my view processing time is spent waiting for COMMITs to complete, and it's particularly bad in views that issued a lot of SELECT queries, even if they didn't modify any data.

I expected a transaction that had been read-only to have very little work to do during the commit phase. What is Postgres doing during COMMIT on these read-only queries?

I know I could turn off synchronous_commit for these transactions to hide the latency from the view, and I certainly don't care about durability for a read-only transaction, but I don't see why it should be necessary, and I'm concerned that doing so might mask a deeper misconfiguration.
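For reference, synchronous_commit can be disabled for just these transactions rather than globally, using SET LOCAL inside the transaction; a minimal sketch (the SELECT body is a placeholder for your actual queries):

```sql
BEGIN;
SET LOCAL synchronous_commit TO OFF;  -- applies to this transaction only
SELECT ...;                           -- the view's read-only queries
COMMIT;                               -- returns without waiting for the WAL flush
```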

asked Aug 09 '13 by Jamey Sharp

2 Answers

There are various clean-up operations that need to be done to keep the database in good shape, such as pruning away dead row versions, and many of these are done by the first process that stumbles upon the opportunity, even if that process is only running SELECT queries.

These clean-up operations can generate WAL records, which then trigger syncs upon COMMIT. So while the SELECTs might be read-only at the user-visible level, behind the scenes they really are doing writes.

It should be possible to detect when all of the WAL operations done in a given transaction are from clean-up operations, and then automatically do the commit asynchronously in those cases. But no one has gotten around to implementing this feature yet (or even cataloging all of the WAL call-sites which are in this category).
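One way to see this happening is to compare the WAL insert position before and after a supposedly read-only query; a rough check, assuming a recently modified table named events (the table name is an example; on 8.4 the function is pg_current_xlog_location):

```sql
SELECT pg_current_xlog_location();   -- note the current WAL position
SELECT count(*) FROM events;         -- "read-only", but may prune dead rows
SELECT pg_current_xlog_location();   -- if this advanced, the SELECT wrote WAL
```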

answered Sep 22 '22 by jjanes

Comments are too short, so here goes. Capture some logs while you run your code and take the guesswork out of the equation.

Update postgresql.conf with the settings below. You will need to restart PostgreSQL for logging_collector to take effect. You can, and should, remove these settings once you are done, so back up postgresql.conf before making any changes. Once you have a log file with the captured data, I recommend using pgBadger (http://dalibo.github.io/pgbadger/) to look at it if the log is more than a page or so.

log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d.log'
log_rotation_age = 0
client_min_messages = notice
log_min_messages = warning
log_min_error_statement = error
log_min_duration_statement = 0
log_checkpoints = on
log_connections = on
log_disconnections = on
log_duration = off
log_error_verbosity = verbose
log_hostname = on
log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u '
log_lock_waits = on
log_statement = 'none'
log_temp_files = 0
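Once the log has accumulated some traffic, you can point pgBadger at it to get a browsable summary of statement durations, checkpoints, and lock waits; the file name below just follows the log_filename pattern above and the date is an example:

```shell
# Generate an HTML report from the captured log (path and date are examples)
pgbadger pg_log/postgresql-2013-08-09.log -o report.html
```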
answered Sep 20 '22 by Kuberchaun