Solution for speeding up a slow SELECT DISTINCT query in Postgres

People also ask

How can I make select distinct faster?

You probably don't want to hear this, but the best option to speed up SELECT DISTINCT is to avoid DISTINCT to begin with. In many cases (not all!) it can be avoided with better database design or better queries. Sometimes GROUP BY is faster, because it takes a different code path.
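
As a rough illustration of avoiding DISTINCT with a better query: a common case is a join that multiplies rows, with DISTINCT added only to collapse them again. The sketch below assumes two hypothetical tables, customers(id, name) and orders(id, customer_id), neither of which comes from the original question, and assumes customers.id is unique.

```sql
-- Hypothetical schema: customers(id, name), orders(id, customer_id).
-- DISTINCT is only needed here because the join repeats each customer
-- once per matching order.
SELECT DISTINCT c.id, c.name
FROM customers c
JOIN orders o ON o.customer_id = c.id;

-- Same rows without DISTINCT, assuming customers.id is unique:
-- EXISTS stops at the first matching order, so no duplicates are produced.
SELECT c.id, c.name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.id
);
```

Whether the second form is actually faster depends on the data and the plan, so compare both with EXPLAIN ANALYZE before switching.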

Does select distinct slow down a query?

Very few queries perform faster with SELECT DISTINCT, and very few perform slower (and then not significantly slower). In the latter case, though, dropping DISTINCT likely means the application has to examine the duplicate rows itself, which shifts the performance and complexity burden to the application.

Is GROUP BY faster than distinct Postgres?

In my experiments, I found that GROUP BY was more than 10 times faster than DISTINCT. The two are not executed the same way. What I learned is that GROUP BY is at least no worse than DISTINCT, and sometimes it is better.

Oftentimes, you can make such queries run faster by working around the distinct by using a group by instead:

select my_table.foo 
from my_table 
where [whatever where conditions you want]
group by foo;
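
To check whether the rewrite helps in a particular case, one option is to compare the two plans directly. This is a minimal sketch that reuses my_table and foo from the example above and leaves out the WHERE clause; on recent Postgres versions the planner may well choose the same plan for both, in which case there is nothing to gain.

```sql
-- Note: EXPLAIN ANALYZE actually executes the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT foo
FROM my_table;

EXPLAIN (ANALYZE, BUFFERS)
SELECT foo
FROM my_table
GROUP BY foo;
```

Look at the top plan node in each output (for example a sort followed by Unique versus HashAggregate) and at the reported execution time.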

Your DISTINCT is causing the database to sort the output rows in order to find duplicates. If you put an index on the column(s) selected by the query, the database may be able to read them out in index order and save the sort step. A lot will depend on the details of the query and the tables involved; saying only that you "know the problem is with the DISTINCT" really limits the scope of available answers.
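
A minimal sketch of the index suggestion, again using my_table and foo from the earlier example. The index name my_table_foo_idx is made up for illustration, and whether the planner uses the index depends on the table and its statistics.

```sql
-- Index on the column selected by the DISTINCT query.
CREATE INDEX my_table_foo_idx ON my_table (foo);

-- Refresh planner statistics so the new index is costed sensibly.
ANALYZE my_table;

-- Check whether the plan now reads rows in index order instead of sorting.
EXPLAIN
SELECT DISTINCT foo
FROM my_table;
```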

You can try increasing the work_mem setting. Depending on the size of your dataset, this can switch the query plan to hash aggregates, which are usually faster.

But before setting it too high globally, read up on it first. You can easily blow up your server, because the max_connections setting acts as a multiplier on this number.

This means that if you set work_mem = 128MB and max_connections = 100 (the default), you should have more than 12.8GB of RAM (128MB × 100). You're essentially telling the server that it can use that much for performing queries, not even considering any other memory use by Postgres or otherwise. The real ceiling can be higher still, because work_mem applies to each sort or hash operation, and a single query can run several of them.
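
One way to try a larger work_mem without changing it globally is to raise it for a single transaction only. The sketch below assumes the my_table example from above; 128MB is just the figure used in the text, not a recommendation.

```sql
-- Inspect the current values first.
SHOW work_mem;
SHOW max_connections;

BEGIN;
-- SET LOCAL only lasts until the end of this transaction.
SET LOCAL work_mem = '128MB';

EXPLAIN (ANALYZE)
SELECT DISTINCT foo
FROM my_table;
COMMIT;
```

If the plan changes to a hash aggregate and the query gets faster, the setting can then be raised for just the sessions or roles that need it rather than for every connection.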

