I have a table of records that is populated sequentially once, but then every record is updated (the order in which they are updated and the timing of the updates are both random). The updates are not HOT updates. Is there any advantage to setting my fillfactor for this table to 50, or even less than 50, given these facts?
In the case of an ordinary update, PostgreSQL would write to at least 11 different pages: one page in the data heap and one page on every index. But in the case of a HOT update, PostgreSQL only writes to a single page. This saves 10 page writes. In this example, we update columns without an index on it.
pg_repack is a Postgres Pro Standard extension which lets you remove bloat from tables and indexes, and optionally restore the physical order of clustered indexes. Unlike CLUSTER and VACUUM FULL it works online, without holding an exclusive lock on the processed tables during processing.
FILLFACTOR. The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default. When a smaller fillfactor is specified, INSERT operations pack table pages only to the indicated percentage; the remaining space on each page is reserved for updating rows on that page.
Ok, as you mentioned in the comments to your question, you're making changes in your table using transactions updating 1-10k records in each transaction. This is right approach leaving some chances to autovacuum to make its work. But table's fillfactor
is not the first thing I'd check/change. Fillfactor can help you to speed up the process, but if autovacuum is not aggressive enough, you'll get very bloated table and bad performance soon.
So, first, I'd suggest you to control your table's bloating level. There is a number of queries which can help you:
Next, I'd tune autovacuum to much more aggressive state than default, like this (this is usually good idea even if you don't need to process whole table in short period of time), something like this:
log_autovacuum_min_duration = 0
autovacuum_vacuum_scale_factor = 0.01
autovacuum_analyze_scale_factor = 0.05
autovacuum_naptime = 60
autovacuum_vacuum_cost_delay = 20
After some significant number of transactions with UPDATEs, check the bloating level.
Finally, yes, I'd tune fillfactor but probably to some higher (and more usual) value like 80 or 90 – here you need to make some predictions, what is the probability that 10% or more tuples inside a page will be updated by the single transaction? If the chances are very high, reduce fillfactor. But you've mentioned that order of rows in UPDATEs is random, so I'd use 80-90%. Keep in mind that there is an obvious trade-off here: if you set fillfactor to 50, your table will need 2x more disk space and all operations will naturally become slower. If you want to go deep to this question, I suggest creating 21 tables with fillfactors 50..100 with the same data and testing UPDATE TPS with pgbench.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With