In pgAdmin, whenever a table's statistics are out-of-date, it prompts:
Running VACUUM recommended
The estimated rowcount on the table schema.table deviates significantly from the actual rowcount. You should run VACUUM ANALYZE on this table.
I've tested it using pgAdmin 3 and Postgres 8.4.4, with autovacuum=off. The prompt shows up immediately whenever I click a table that has been changed.
Let's say I'm making a web-based system in Java, how do I detect if a table is out-of-date, so that I can show a prompt like the one in pgAdmin?
Because of the nature of my application, here are a few rules I have to follow:
I want to know if the statistics of a certain table in pg_stats and pg_statistic are up to date.
I can't set the autovacuum flag in postgresql.conf. (In other words, the autovacuum flag can be on or off. I have no control over it. I need to tell if the stats are up-to-date whether the autovacuum flag is on or off.)
I can't run vacuum/analyze every time to make it up-to-date.
When a user selects a table, I need to show the prompt that the table is outdated when there are any updates to this table (such as drop, insert, and update) that are not reflected in pg_stats and pg_statistic.
It seems that it's not feasible by analyzing timestamps in pg_catalog.pg_stat_all_tables. Of course, if a table hasn't been analyzed before, I can check if it has a timestamp in last_analyze to find out whether the table is up-to-date. Using this method, however, I can't detect if the table is up-to-date when there's already a timestamp. In other words, no matter how many rows I add to the table, its last_analyze timestamp in pg_stat_all_tables is always for the first analyze (assuming the autovacuum flag is off). Therefore, I can only show the "Running VACUUM recommended" prompt for the first time.
It's also not feasible by comparing the last_analyze timestamp to the current timestamp. There might not be any updates to the table for days. And there might be tons of updates in one hour.
Given this scenario, how can I always tell if the statistics of a table are up-to-date?
By default, PostgreSQL is configured with an AUTOVACUUM daemon which automates the run of statistics collection by using the ANALYZE commands (in addition to automation of the VACUUM command). The AUTOVACUUM daemon scans for tables that show signs of large modifications in data to collect the current statistics.
Utilizing stats tables in PostgreSQL, you can monitor the number of live and dead rows, also referred to as tuples, in the table. Live rows are the rows in your table that are currently in use and can be queried in Chartio to reference and analyze data.
Summary. Use \l or \l+ in psql to show all databases in the current PostgreSQL server. Use the SELECT statement to query data from the pg_database to get all databases.
With the help of ANALYZE command, you can get the stats of all the Postgres databases. For this, log in to your Postgres console and then type ANALYZE to execute the command. Upon successful execution of the command, it would return ANALYZE.
Check the system catalogs.
test=# SELECT schemaname, relname, last_analyze FROM pg_stat_all_tables WHERE relname = 'city'; schemaname | relname | last_analyze ------------+---------+------------------------------- pagila | city | 2011-07-26 19:30:59.357898-07 world | city | 2011-07-26 19:30:53.119366-07 (2 rows)
All kinds of useful information in there:
test=# \d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables" Column | Type | Modifiers -------------------+--------------------------+----------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint | n_tup_hot_upd | bigint | n_live_tup | bigint | n_dead_tup | bigint | last_vacuum | timestamp with time zone | last_autovacuum | timestamp with time zone | last_analyze | timestamp with time zone | last_autoanalyze | timestamp with time zone | vacuum_count | bigint | autovacuum_count | bigint | analyze_count | bigint | autoanalyze_count | bigint |
You should not have to worry about vac'ing in your application. Instead, you should have the autovac
process configured on your server (in postgresql.conf
), and the server takes takes of VACCUM
and ANALYZE
processes based on its own internal statistics. You can configure how often it should run, and what the threshold variables are for it to process.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With