We have a performance issue with a specific SQL query and we're trying to figure out how we could improve it. Its execution time is about 20–100 seconds!
Here is the query and its EXPLAIN ANALYZE output:
SELECT "jobs".* FROM "jobs"
WHERE "jobs"."status" IN (1, 2, 3, 4)
ORDER BY "jobs"."due_date" ASC
LIMIT 5;
Limit (cost=0.42..1844.98 rows=5 width=2642) (actual time=16927.150..18151.643 rows=1 loops=1)
-> Index Scan using index_jobs_on_due_date on jobs (cost=0.42..1278647.41 rows=3466 width=2642) (actual time=16927.148..18151.641 rows=1 loops=1)
Filter: (status = ANY ('{1,2,3,4}'::integer[]))
Rows Removed by Filter: 595627
Planning time: 0.205 ms
Execution time: 18151.684 ms
We are using PostgreSQL 9.6.11 on AWS RDS.
The table has ~500K rows. The fields related to the query are:
We have the following indexes:
CREATE INDEX index_jobs_on_due_date ON public.jobs USING btree (due_date)
CREATE INDEX index_jobs_on_due_date_and_status ON public.jobs USING btree (due_date, status)
CREATE INDEX index_jobs_on_status ON public.jobs USING btree (status)
CREATE UNIQUE INDEX jobs_pkey ON public.jobs USING btree (id)
Thank you in advance, - Jack
For this query:
SELECT j.*
FROM "jobs" j
WHERE j."status" IN (1, 2, 3, 4)
ORDER BY j."due_date" ASC
LIMIT 5;
The "obvious" index is on (status). But that may not help. The goal is to get rid of the sorting, so you can rewrite the query to use an index on jobs(status, due_date):
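Note that your existing index_jobs_on_due_date_and_status has the columns in the opposite order (due_date, status), which does not serve this access pattern. A minimal sketch of the suggested index (the name is just an example):

```sql
-- Equality column first, sort column second, so each per-status
-- subquery can read its top 5 rows straight off the index.
-- Index name is illustrative; pick one that fits your conventions.
CREATE INDEX index_jobs_on_status_and_due_date
    ON public.jobs USING btree (status, due_date);
```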
select j.*
from ((select j.*
from jobs j
where j.status = 1
order by j.due_date asc
limit 5
) union all
(select j.*
from jobs j
where j.status = 2
order by j.due_date asc
limit 5
) union all
(select j.*
from jobs j
where j.status = 3
order by j.due_date asc
limit 5
) union all
(select j.*
from jobs j
where j.status = 4
order by j.due_date asc
limit 5
)
) j
order by due_date
limit 5;
The subqueries should each use the composite index. The final sort is then over (at most) 20 rows, which should be fast.
EDIT:
Here is a related idea, with the same index:
SELECT j.*
FROM (SELECT j.*,
ROW_NUMBER() OVER (PARTITION BY j.status ORDER BY j.due_date ASC) as seqnum
FROM "jobs" j
) j
WHERE j.status in (1, 2, 3, 4) AND seqnum <= 5
ORDER BY j.due_date ASC
LIMIT 5;
This can use the index for the ROW_NUMBER() calculation, although it might still require a full scan of the table. But the final sort is limited to (at most) 20 rows, so it is cheap.
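A possible variation: filtering on status inside the subquery, before ROW_NUMBER() is computed, returns the same rows (the partitions are per status, so the row numbers within each kept status are unchanged) and may let the planner avoid numbering rows it will discard anyway:

```sql
SELECT j.*
FROM (SELECT j.*,
             ROW_NUMBER() OVER (PARTITION BY j.status ORDER BY j.due_date ASC) AS seqnum
      FROM "jobs" j
      WHERE j.status IN (1, 2, 3, 4)   -- filter before numbering
     ) j
WHERE seqnum <= 5
ORDER BY j.due_date ASC
LIMIT 5;
```

Whether this actually changes the plan depends on your data distribution, so check it with EXPLAIN ANALYZE.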