
PostgreSQL select count query takes a long time

I have a table named events in my PostgreSQL 9.5 database, and this table has about 6 million records.

I am running a select count(event_id) from events query, but it takes 40 seconds, which is a very long time for a database. The event_id column is the table's primary key and is indexed. Why does this take so long? (The server is an Ubuntu VM on VMware with 4 CPUs.)

Explain:

"Aggregate  (cost=826305.19..826305.20 rows=1 width=0) (actual time=24739.306..24739.306 rows=1 loops=1)"
"  Buffers: shared hit=13 read=757739 dirtied=53 written=48"
"  ->  Seq Scan on event_source  (cost=0.00..812594.55 rows=5484255 width=0) (actual time=0.014..24087.050 rows=6320689 loops=1)"
"        Buffers: shared hit=13 read=757739 dirtied=53 written=48"
"Planning time: 0.369 ms"
"Execution time: 24739.364 ms"
asked Mar 06 '19 by barteloma


2 Answers

I know that this is an old question and the existing answer covers the vast majority of the information around this, but I just ran into a situation where a table of 1.3 million rows was taking about 35 seconds to perform a simple SELECT COUNT(*). None of the other solutions helped. The issue ended up being that the table was bloated and hadn't been vacuumed, so Postgres couldn't figure out an optimal way to query the data. After I ran this, the query time dropped down to about 25 ms!

VACUUM (ANALYZE, VERBOSE, FULL) my_table_name;

Hope this helps someone else!
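Before resorting to a full vacuum, it can be worth confirming that bloat is actually the problem. A minimal check against the statistics views (a sketch, assuming the events table from the question):

```sql
-- Compare dead vs. live tuples and see when the table was last vacuumed.
-- A large n_dead_tup relative to n_live_tup suggests bloat.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'events';
```

Note that VACUUM FULL rewrites the whole table and takes an exclusive lock for the duration, so on a busy production system a plain VACUUM (ANALYZE) is usually the safer first step.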

answered Oct 24 '22 by Alec Sanger

There are multiple factors that play a big role in how PostgreSQL decides to execute a count(), but first of all: the column you use inside the count function does not matter. In fact, if you don't need a DISTINCT count, stick with count(*).

You can try the following to force an index-only scan:

SELECT count(*) FROM (SELECT event_id FROM events) t;

...if that still results in a sequential scan, then most likely the index is not much smaller than the table itself. To still see how an index-only scan would perform, you can force it with:

SELECT count(*) FROM (SELECT event_id FROM events ORDER BY 1) t;
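Another way to test the index path, sketched here for the current session only, is to make sequential scans unattractive to the planner via a session setting:

```sql
-- Discourage (not forbid) sequential scans by giving them a very high cost
SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS) SELECT count(event_id) FROM events;
-- Restore the default afterwards
RESET enable_seqscan;
```

This is a diagnostic tool, not a fix: if the index-only scan is still slow, the visibility map is likely stale and a VACUUM on the table should help.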

If that is not much faster, you should also consider upgrading PostgreSQL to at least version 9.6, which introduced parallel sequential scans to speed up queries like this.
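On 9.6 or later you can check whether the planner chooses a parallel sequential scan; the worker count below is an assumption matching the 4 CPUs mentioned in the question:

```sql
-- Allow up to 4 parallel workers per Gather node for this session
SET max_parallel_workers_per_gather = 4;
-- The plan should now show "Gather" with "Workers Planned" above the Seq Scan
EXPLAIN (ANALYZE) SELECT count(*) FROM events;
```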

In addition, you can achieve dramatic speedups by choosing from a variety of techniques for providing counts, which largely depend on your use case and requirements:

  • Faster PostgreSQL Counting
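One such technique, for cases where an approximate count is acceptable: read the planner's row estimate from the catalog instead of scanning the table at all (the estimate is kept up to date by ANALYZE and autovacuum):

```sql
-- Approximate row count from planner statistics; near-instant but not exact
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE relname = 'events';
```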

Last but not least, please always provide the output of an extended explain as @a_horse_with_no_name already recommended, e.g.:

EXPLAIN (ANALYZE, BUFFERS) SELECT count(event_id) FROM events;
answered Oct 24 '22 by Ancoron