
PostgreSQL select count query takes a long time

I have a table named events in my PostgreSQL 9.5 database, and this table has about 6 million records.

I am running a select count(event_id) from events query, but it takes 40 seconds, which is a very long time for a database. The event_id column is the table's primary key and is indexed. Why does this take so long? (The server is an Ubuntu VM on VMware with 4 CPUs.)

Explain:

"Aggregate  (cost=826305.19..826305.20 rows=1 width=0) (actual time=24739.306..24739.306 rows=1 loops=1)"
"  Buffers: shared hit=13 read=757739 dirtied=53 written=48"
"  ->  Seq Scan on event_source  (cost=0.00..812594.55 rows=5484255 width=0) (actual time=0.014..24087.050 rows=6320689 loops=1)"
"        Buffers: shared hit=13 read=757739 dirtied=53 written=48"
"Planning time: 0.369 ms"
"Execution time: 24739.364 ms"
asked Mar 06 '19 by barteloma


2 Answers

I know that this is an old question and the existing answer covers the vast majority of the information around this, but I just ran into a situation where a table of 1.3 million rows was taking about 35 seconds to perform a simple SELECT COUNT(*). None of the other solutions helped. The issue ended up being that the table was bloated and hadn't been vacuumed, so Postgres couldn't figure out an optimal way to query the data. After I ran this, the query time dropped down to about 25 ms!

VACUUM (ANALYZE, VERBOSE, FULL) my_table_name;

Hope this helps someone else!
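Before resorting to a full vacuum, it can be worth confirming that bloat is actually the problem. A minimal check against the statistics views (a sketch, assuming the events table from the question):

```sql
-- Compare dead vs. live tuples and see when the table was last vacuumed.
-- A large n_dead_tup relative to n_live_tup suggests bloat.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'events';
```

Note that VACUUM FULL rewrites the whole table and takes an exclusive lock for the duration, so on a busy production system a plain VACUUM (ANALYZE) is usually the safer first step.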

answered Oct 24 '22 by Alec Sanger

There are multiple factors that play a big role in how PostgreSQL decides to execute a count(), but first of all: the column you use inside the count function does not matter. In fact, if you don't need a DISTINCT count, stick with count(*).

You can try the following to force an index-only scan:

SELECT count(*) FROM (SELECT event_id FROM events) t;

...if that still results in a sequential scan, then most likely the index is not much smaller than the table itself. To still see how an index-only scan would perform, you can force it with:

SELECT count(*) FROM (SELECT event_id FROM events ORDER BY 1) t;
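Another way to test the index path, sketched here for the current session only, is to make sequential scans unattractive to the planner via a session setting:

```sql
-- Discourage (not forbid) sequential scans by giving them a very high cost
SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS) SELECT count(event_id) FROM events;
-- Restore the default afterwards
RESET enable_seqscan;
```

This is a diagnostic tool, not a fix: if the index-only scan is still slow, the visibility map is likely stale and a VACUUM on the table should help.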

If that is not much faster, you should also consider upgrading PostgreSQL to at least version 9.6, which introduced parallel sequential scans to speed up queries like this.
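On 9.6 or later you can check whether the planner chooses a parallel sequential scan; the worker count below is an assumption matching the 4 CPUs mentioned in the question:

```sql
-- Allow up to 4 parallel workers per Gather node for this session
SET max_parallel_workers_per_gather = 4;
-- The plan should now show "Gather" with "Workers Planned" above the Seq Scan
EXPLAIN (ANALYZE) SELECT count(*) FROM events;
```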

In addition, you can achieve dramatic speedups by choosing from a variety of techniques for providing counts, which largely depend on your use case and requirements:

  • Faster PostgreSQL Counting
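One such technique, for cases where an approximate count is acceptable: read the planner's row estimate from the catalog instead of scanning the table at all (the estimate is kept up to date by ANALYZE and autovacuum):

```sql
-- Approximate row count from planner statistics; near-instant but not exact
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE relname = 'events';
```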

Last but not least, please always provide the output of an extended explain as @a_horse_with_no_name already recommended, e.g.:

EXPLAIN (ANALYZE, BUFFERS) SELECT count(event_id) FROM events;
answered Oct 24 '22 by Ancoron