We need to count the number of rows in a PostgreSQL table. In our case, no conditions need to be met, and it would be perfectly acceptable to get a row estimate if that significantly improved query speed.
Basically, we want select count(id) from <table>
to run as fast as possible, even if that implies not getting exact results.
So to make SELECT COUNT(*) queries fast, here's what to do: get on any version that supports batch mode on columnstore indexes, and put a columnstore index on the table, although your experience will vary dramatically depending on the kind of query you have. (Batch mode and columnstore indexes are SQL Server features, so this advice does not carry over to PostgreSQL directly.)
By default, a count query counts every row, including duplicates. Let's touch upon distinct, which is often used alongside count. This command uses an index-only scan but still takes around 3.5 seconds. The speed depends on many factors, including cardinality, the size of the table, and whether the index is cached.
The basic SQL standard query to count the rows in a table is: SELECT count(*) FROM table_name; This can be rather slow because PostgreSQL has to check visibility for all rows, due to the MVCC model.
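To see what the exact count costs on a particular table, one option is to run it under EXPLAIN. This is only a minimal sketch; my_schema.my_table is a placeholder name, not one taken from the question.

```sql
-- Placeholder table name; substitute your own.
-- EXPLAIN (ANALYZE, BUFFERS) actually executes the query and reports
-- the chosen plan, the run time, and how many pages were touched.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM my_schema.my_table;
```

Because the query really runs, expect this to take as long as the plain count itself.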
Some of the tricks we used to speed up SELECT-s in PostgreSQL: LEFT JOIN with redundant conditions, VALUES, extended statistics, primary key type conversion, CLUSTER, pg_hint_plan + bonus.
For a very quick estimate:
SELECT reltuples FROM pg_class WHERE relname = 'my_table';
There are several caveats, though. For one, relname is not necessarily unique in pg_class. There can be multiple tables with the same relname in multiple schemas of the database. To be unambiguous:
SELECT reltuples::bigint FROM pg_class WHERE oid = 'my_schema.my_table'::regclass;
If you do not schema-qualify the table name, a cast to regclass observes the current search_path to pick the best match. And if the table does not exist (or cannot be seen) in any of the schemas in the search_path, you get an error message. See Object Identifier Types in the manual.
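The resolution rules above can be tried out with a short sketch like the following. The names my_table and my_schema.my_table are placeholders, and the use of to_regclass() (available since PostgreSQL 9.4) is a suggestion added here, not something the answer itself relies on.

```sql
-- Show which schemas an unqualified name is resolved against.
SHOW search_path;

-- Unqualified name: resolved through search_path.
-- Raises an error if no visible table matches.
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE oid = 'my_table'::regclass;

-- to_regclass() returns NULL instead of raising an error,
-- so this query simply returns zero rows for a missing table.
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE oid = to_regclass('my_schema.my_table');
```

The second form can be handy in scripts where a missing table should not abort the whole transaction.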
The cast to bigint formats the real number nicely, especially for big counts.
Also, reltuples can be more or less out of date. There are ways to make up for this to some extent. See this later answer with new and improved options:
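One common way to compensate for stale statistics, sketched here under assumptions and not necessarily identical to what the referenced answer proposes, is to scale the stored rows-per-page ratio by the table's current size on disk. The table name is a placeholder.

```sql
-- Estimate = (rows per page at last VACUUM/ANALYZE) * (pages on disk now).
-- my_schema.my_table is a placeholder name.
SELECT (CASE WHEN c.reltuples < 0 THEN NULL       -- never vacuumed/analyzed (PostgreSQL 14+)
             WHEN c.relpages = 0 THEN float8 '0'  -- empty at last analyze; avoids division by zero
             ELSE c.reltuples / c.relpages END
        * (pg_relation_size(c.oid) / current_setting('block_size')::int)
       )::bigint AS estimate
FROM pg_class c
WHERE c.oid = 'my_schema.my_table'::regclass;
```

This assumes the average row density has not changed much since the statistics were last collected. If it has, the result drifts accordingly.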
And a query on pg_stat_user_tables is many times slower (though still much faster than a full count), as that's a view on a couple of tables.
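For comparison, a query against that view might look like the sketch below. The schema and table names are placeholders; n_live_tup is the view's estimated number of live rows.

```sql
-- Placeholder schema and table names.
SELECT n_live_tup AS estimate
FROM pg_stat_user_tables
WHERE schemaname = 'my_schema'
  AND relname = 'my_table';
```

Filtering on schemaname as well as relname avoids the ambiguity described for pg_class above.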
Counting is slow for big tables, so you can get a close estimate this way:
SELECT reltuples::bigint AS estimate FROM pg_class WHERE relname = 'tableName';
It is extremely fast. Because of the cast to bigint the result is not a float, but it is still only a close estimate.
reltuples is a column of the pg_class table. It holds the "number of rows in the table. This is only an estimate used by the planner. It is updated by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX" (manual).
pg_class catalogs tables and most everything else that has columns or is otherwise similar to a table. This includes indexes (but see also pg_index), sequences, views, composite types, and some kinds of special relation (manual).
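Since reltuples is refreshed by ANALYZE, a simple way to get a reasonably current estimate is to analyze the table first and then read the catalog. This is a sketch with a placeholder table name; selecting relkind is added here only to show what kind of relation the row describes.

```sql
-- Refresh planner statistics; this also updates reltuples.
ANALYZE my_schema.my_table;

-- relkind is 'r' for an ordinary table, 'i' for an index,
-- 'S' for a sequence, 'v' for a view, 'c' for a composite type.
SELECT c.relname, c.relkind, c.reltuples::bigint AS estimate
FROM pg_class c
WHERE c.oid = 'my_schema.my_table'::regclass;
```

ANALYZE samples the table rather than reading all of it, so it is usually far cheaper than a full count, but the result remains an estimate.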