I have PostgreSQL 9.5.9 and two tables: table1 and table2.

table1:
Column | Type | Modifiers
------------+--------------------------------+-------------------------------------------
id | integer | not null
status | integer | not null
table2_id | integer |
start_date | timestamp(0) without time zone | default NULL::timestamp without time zone
Indexes:
"table1_pkey" PRIMARY KEY, btree (id)
"table1_start_date" btree (start_date)
"table1_table2" btree (table2_id)
Foreign-key constraints:
"fk_t1_t2" FOREIGN KEY (table2_id) REFERENCES table2(id)
table2:
Column | Type | Modifiers
--------+-------------------------+---------------------------------
id | integer | not null
name | character varying(2000) | default NULL::character varying
Indexes:
"table2_pkey" PRIMARY KEY, btree (id)
Referenced by:
TABLE "table1" CONSTRAINT "fk_t1_t2" FOREIGN KEY (table2_id) REFERENCES table2(id)
table2 contains only 3 rows; table1 contains about 400,000 rows, and only half of them have a value in the table2_id column.
The query is fast enough when I select some values from table1 ordered by the start_date column, because the table1_start_date index is used effectively:
SELECT t1.*
FROM table1 AS t1
ORDER BY t1.start_date DESC
LIMIT 25 OFFSET 150000;
EXPLAIN ANALYZE result:
Limit  (cost=7797.40..7798.70 rows=25 width=20) (actual time=40.994..41.006 rows=25 loops=1)
  ->  Index Scan Backward using table1_start_date on table1 t1  (cost=0.42..20439.74 rows=393216 width=20) (actual time=0.078..36.251 rows=150025 loops=1)
Planning time: 0.097 ms
Execution time: 41.033 ms
But the query becomes very slow when I add a LEFT JOIN to fetch values from table2 too:
SELECT t1.*, t2.*
FROM table1 AS t1
LEFT JOIN table2 AS t2 ON t2.id = t1.table2_id
ORDER BY t1.start_date DESC
LIMIT 25 OFFSET 150000;
EXPLAIN ANALYZE result:
Limit (cost=33690.80..33696.42 rows=25 width=540) (actual time=191.282..191.320 rows=25 loops=1)
  ->  Nested Loop Left Join  (cost=0.57..88317.50 rows=393216 width=540) (actual time=0.028..184.537 rows=150025 loops=1)
        ->  Index Scan Backward using table1_start_date on table1 t1  (cost=0.42..20439.74 rows=393216 width=20) (actual time=0.018..35.196 rows=150025 loops=1)
        ->  Index Scan using table2_pkey on table2 t2  (cost=0.14..0.16 rows=1 width=520) (actual time=0.000..0.001 rows=1 loops=150025)
              Index Cond: (id = t1.table2_id)
Planning time: 0.210 ms
Execution time: 191.357 ms
Why did the query time increase from 41 ms to 191 ms? As I understand it, the LEFT JOIN cannot affect the result set, so PostgreSQL could select the 25 rows from table1 first (LIMIT 25) and only then join the rows from table2; the execution time shouldn't increase significantly. There are no tricky conditions that could prevent the use of the index, etc.
I don't completely understand the EXPLAIN ANALYZE output for the second query, but it seems the planner decided to "perform the join and then apply the limit" instead of "apply the limit and then join". Done this way, the query is too slow. What is the problem?
If you don't include columns from the left-joined table in the SELECT list, the LEFT JOIN can be faster than the same query with an INNER JOIN. If you do include columns from the left-joined table, the INNER JOIN version of the same query is equal to or faster than the LEFT JOIN.
The LEFT JOIN query is slower than the INNER JOIN query because it's doing more work.
Nested loop joins are particularly efficient if the outer relation is small, because then the inner loop won't be executed too often.
If PostgreSQL chooses the wrong strategy, query performance can suffer a lot. It helps to understand the join strategies, how you can support them with indexes, what can go wrong with them, and how you can tune your joins for better performance.
In such cases, a bad row-count estimate is often the cause of the problem. So while the join may be where most of the execution time is spent, the cause is a misestimate that happened earlier in the plan. Find out what the best join strategy is (perhaps PostgreSQL is doing the right thing anyway).
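One way to check this in practice (a sketch: ANALYZE refreshes the statistics the row-count estimates are based on, and enable_nestloop is a standard planner setting for ruling out a strategy during a session):

```sql
-- Refresh the statistics the planner bases its row-count estimates on.
ANALYZE table1;
ANALYZE table2;

-- Temporarily rule out nested loops for this session to see which
-- strategy the planner falls back to, and how that plan performs.
SET enable_nestloop = off;
EXPLAIN ANALYZE
SELECT t1.*, t2.*
FROM table1 AS t1
LEFT JOIN table2 AS t2 ON t2.id = t1.table2_id
ORDER BY t1.start_date DESC
LIMIT 25 OFFSET 150000;
RESET enable_nestloop;
```

Comparing the two EXPLAIN ANALYZE outputs shows whether the planner's choice was actually the cheaper one for this data.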
SQL queries are mostly declarative: you describe what data you would like to retrieve, Postgres figures out a plan for how to get it for you, then executes that plan. This planning process is similar to how you might plan a trip: what sights do you want to see? When are they open?
First, PostgreSQL scans the inner relation sequentially and builds a hash table, where the hash key consists of all join keys that use the = operator. Then it scans the outer relation sequentially and probes the hash for each row found to find matching join keys.
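For instance, without the ORDER BY/LIMIT the planner is free to pick exactly that strategy here, building the hash table on the 3-row table2 (a sketch; the actual plan choice depends on the statistics):

```sql
EXPLAIN
SELECT t1.*, t2.*
FROM table1 AS t1
LEFT JOIN table2 AS t2 ON t2.id = t1.table2_id;
-- Typically yields a plan along the lines of:
--   Hash Left Join
--     Hash Cond: (t1.table2_id = t2.id)
--     ->  Seq Scan on table1 t1
--     ->  Hash
--           ->  Seq Scan on table2 t2
```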
It just doesn't know that the limit should apply to table1 rather than to the result of the join, so it fetches the minimum required number of rows, which is 150,025, and then performs 150,025 loops over table2. If you do a subselect with the limit on table1 and join table2 to that subselect, you should get what you want:
SELECT t1.*, t2.*
FROM (SELECT *
FROM table1
ORDER BY start_date DESC
LIMIT 25 OFFSET 150000) AS t1
LEFT JOIN table2 AS t2 ON t2.id = t1.table2_id;
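The same effect can also be written with a CTE; in PostgreSQL 9.5 a CTE is an optimization fence, so the LIMIT/OFFSET is guaranteed to be applied to table1 before the join (a sketch equivalent to the subselect above; from v12 onward you would need MATERIALIZED to force this):

```sql
WITH first_page AS (
    SELECT *
    FROM table1
    ORDER BY start_date DESC
    LIMIT 25 OFFSET 150000
)
SELECT t1.*, t2.*
FROM first_page AS t1
LEFT JOIN table2 AS t2 ON t2.id = t1.table2_id;
```

Note that either form still has to walk 150,025 index entries to satisfy the OFFSET; the rewrite only avoids probing table2 for the rows that are skipped.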