The SQL COUNT() function returns the number of rows in a table satisfying the criteria specified in the WHERE clause. It sets the number of rows or non NULL column values. COUNT() returns 0 if there were no matching rows.
To counts all of the rows in a table, whether they contain NULL values or not, use COUNT(*). That form of the COUNT() function basically returns the number of rows in a result set returned by a SELECT statement.
The PostgreSQL COUNT function counts a number of rows or non-NULL values against a specific column from a table. When an asterisk(*) is used with count function the total number of rows returns. Syntax: COUNT (* | [DISTINCT] ALL | column_name)
Counting rows in big tables is known to be slow in PostgreSQL. The MVCC model requires a full count of live rows for a precise number. There are workarounds to speed this up dramatically if the count does not have to be exact like it seems to be in your case.
(Remember that even an "exact" count is potentially dead on arrival under concurrent write load.)
Slow for big tables.
With concurrent write operations, it may be outdated the moment you get it.
SELECT count(*) AS exact_count FROM myschema.mytable;
Extremely fast:
SELECT reltuples AS estimate FROM pg_class where relname = 'mytable';
Typically, the estimate is very close. How close, depends on whether ANALYZE
or VACUUM
are run enough - where "enough" is defined by the level of write activity to your table.
The above ignores the possibility of multiple tables with the same name in one database - in different schemas. To account for that:
SELECT c.reltuples::bigint AS estimate
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = 'mytable'
AND n.nspname = 'myschema';
The cast to bigint
formats the real
number nicely, especially for big counts.
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE oid = 'myschema.mytable'::regclass;
Faster, simpler, safer, more elegant. See the manual on Object Identifier Types.
Replace 'myschema.mytable'::regclass
with to_regclass('myschema.mytable')
in Postgres 9.4+ to get nothing instead of an exception for invalid table names. See:
We can do what the Postgres planner does. Quoting the Row Estimation Examples in the manual:
These numbers are current as of the last
VACUUM
orANALYZE
on the table. The planner then fetches the actual current number of pages in the table (this is a cheap operation, not requiring a table scan). If that is different fromrelpages
thenreltuples
is scaled accordingly to arrive at a current number-of-rows estimate.
Postgres uses estimate_rel_size
defined in src/backend/utils/adt/plancat.c
, which also covers the corner case of no data in pg_class
because the relation was never vacuumed. We can do something similar in SQL:
SELECT (reltuples / relpages * (pg_relation_size(oid) / 8192))::bigint
FROM pg_class
WHERE oid = 'mytable'::regclass; -- your table here
SELECT (CASE WHEN c.reltuples < 0 THEN NULL -- never vacuumed
WHEN c.relpages = 0 THEN float8 '0' -- empty table
ELSE c.reltuples / c.relpages END
* (pg_relation_size(c.oid) / pg_catalog.current_setting('block_size')::int)
)::bigint
FROM pg_class c
WHERE c.oid = 'myschema.mytable'::regclass; -- schema-qualified table here
Doesn't break with empty tables and tables that have never seen VACUUM
or ANALYZE
. The manual on pg_class
:
If the table has never yet been vacuumed or analyzed,
reltuples
contains-1
indicating that the row count is unknown.
If this query returns NULL
, run ANALYZE
or VACUUM
for the table and repeat. (Alternatively, you could estimate row width based on column types like Postgres does, but that's tedious and error-prone.)
If this query returns 0
, the table seems to be empty. But I would ANALYZE
to make sure. (And maybe check your autovacuum
settings.)
Typically, block_size
is 8192. current_setting('block_size')::int
covers rare exceptions.
Table and schema qualifications make it immune to any search_path
and scope.
Either way, the query consistently takes < 0.1 ms for me.
More Web resources:
TABLESAMPLE SYSTEM (n)
in Postgres 9.5+SELECT 100 * count(*) AS estimate FROM mytable TABLESAMPLE SYSTEM (1);
Like @a_horse commented, the added clause for the SELECT
command can be useful if statistics in pg_class
are not current enough for some reason. For example:
autovacuum
running.INSERT
/ UPDATE
/ DELETE
.TEMPORARY
tables (which are not covered by autovacuum
).This only looks at a random n % (1
in the example) selection of blocks and counts rows in it. A bigger sample increases the cost and reduces the error, your pick. Accuracy depends on more factors:
FILLFACTOR
occupy space per block. If unevenly distributed across the table, the estimate may be off.Typically, the estimate from pg_class
will be faster and more accurate.
First, I need to know the number of rows in that table, if the total count is greater than some predefined constant,
And whether it ...
... is possible at the moment the count pass my constant value, it will stop the counting (and not wait to finish the counting to inform the row count is greater).
Yes. You can use a subquery with LIMIT
:
SELECT count(*) FROM (SELECT 1 FROM token LIMIT 500000) t;
Postgres actually stops counting beyond the given limit, you get an exact and current count for up to n rows (500000 in the example), and n otherwise. Not nearly as fast as the estimate in pg_class
, though.
I did this once in a postgres app by running:
EXPLAIN SELECT * FROM foo;
Then examining the output with a regex, or similar logic. For a simple SELECT *, the first line of output should look something like this:
Seq Scan on uids (cost=0.00..1.21 rows=8 width=75)
You can use the rows=(\d+)
value as a rough estimate of the number of rows that would be returned, then only do the actual SELECT COUNT(*)
if the estimate is, say, less than 1.5x your threshold (or whatever number you deem makes sense for your application).
Depending on the complexity of your query, this number may become less and less accurate. In fact, in my application, as we added joins and complex conditions, it became so inaccurate it was completely worthless, even to know how within a power of 100 how many rows we'd have returned, so we had to abandon that strategy.
But if your query is simple enough that Pg can predict within some reasonable margin of error how many rows it will return, it may work for you.
Reference taken from this Blog.
You can use below to query to find row count.
Using pg_class:
SELECT reltuples::bigint AS EstimatedCount
FROM pg_class
WHERE oid = 'public.TableName'::regclass;
Using pg_stat_user_tables:
SELECT
schemaname
,relname
,n_live_tup AS EstimatedCount
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
You can also just SELECT MAX(id) FROM <table_name>
; change id to whatever the PK of the table is
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With