Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to tell when a Postgres table was clustered and what indexes were used

I've been impressed by the performance improvements achieved with clustering, but not with how long it takes.

I know clustering needs to be rebuilt if a table or partition is changed after the clustering, but unless I've made a note of when I last clustered a table, how can I tell when I need to do it again?

I can use this query to tell me what table(s) have one or more clustered indexes

SELECT *
FROM   pg_class c
JOIN   pg_index i ON i.indrelid = c.oid
WHERE  relkind = 'r' AND relhasindex AND i.indisclustered 

My questions are.

  • How can I tell which indexes have been clustered?
  • Is there any way of finding out exactly when a table was last clustered?
  • How can I tell if a clustered index is still 'valid', or in other words, how can tell how much a table/index has changed enough that I need to re-build the cluster.

I've noticed that it takes just as long to re-build a clustered index as it does to build it in the first place (even if the table hasn't been touched in the meantime). So I want to avoid re-clustering unless I know the table needs it.


UPDATE for clarity (I hope)

If I use this command....

CLUSTER tableA USING tableA_idx1;
  • How can I find out at a later date which index was referenced i.e. tableA_idx1 (the table has multiple indexes defined)?
  • Is it recorded anywhere when this command was run?
  • I know that the cluster may need to be rebuilt/refreshed/recreated (not sure of the correct phraseology) occasionally using CLUSTER tableA when the table undergoes changes. Is there anyway of knowing when the table has changed so much that the clustering no longer helps?
like image 750
ConanTheGerbil Avatar asked Nov 14 '18 11:11

ConanTheGerbil


People also ask

Does PostgreSQL use clustered indexes?

PostgreSQL provides clustered index functionality to the user in which every table of the database has a unique clustered index. Clustered index means it stores another value of table on secondary storage. Clustered index is used to uniquely identify rows from a table.

Does Postgres create clustered index on primary key?

PostgreSQL automatically creates indexes for PRIMARY KEY and every UNIQUE constraints of a table. Login to a database in PostgreSQL terminal and type \d table_name . All stored indexes will be visualized. If there is a clustered index then it will also be identified.

What is clustered index and non clustered index in PostgreSQL?

In Non-Clustered index, index key defines order of data within index. A Clustered index is a type of index in which table records are physically reordered to match the index. A Non-Clustered index is a special type of index in which logical order of index does not match physical stored order of the rows on disk.

How do I find my Postgres cluster name?

The cluster name appears in the process title for all server processes in this cluster. Moreover, it is the default application name for a standby connection (see synchronous_standby_names.) The name can be any string of less than NAMEDATALEN characters (64 characters in a standard build).


1 Answers

To tell which index was last used to cluster the table, use the pg_index system catalog.

Query the table for all indexes that belong to your table and see which one has indisclustered set. A table can only be clustered by a single index at a time.

There is no way to find out when the table was last clustered, but that's not very interesting anyway. What you want to know is how good the clustering still is.

To find that, query the pg_stats line for the column on which you clustered. If correlation is close to 1, you are still good. The smaller the value gets, the more clustering is indicated.

like image 90
Laurenz Albe Avatar answered Oct 07 '22 01:10

Laurenz Albe