Postgres chooses much slower Seq Scan instead of Index Scan

I started by ensuring the planner had updated stats:

my_db=> vacuum analyze;
VACUUM
Time: 1401.958 ms

When only selecting foos.bar_id, the query executes fine with an Index Only Scan on that column:

my_db=> EXPLAIN ANALYZE SELECT foos.bar_id FROM foos INNER JOIN bar_ids ON foos.bar_id = bar_ids.id;
 QUERY PLAN                                                                            
 Nested Loop  (cost=0.43..16203.46 rows=353198 width=4) (actual time=0.045..114.746 rows=196205 loops=1)
   ->  Seq Scan on bar_ids  (cost=0.00..16.71 rows=871 width=4) (actual time=0.005..0.195 rows=871 loops=1)
   ->  Index Only Scan using index_foos_on_bar_id on foos  (cost=0.43..14.80 rows=378 width=4) (actual time=0.003..0.055 rows=225 loops=871)
         Index Cond: (bar_id = bar_ids.id)
         Heap Fetches: 0
 Planning time: 0.209 ms
 Execution time: 144.364 ms
(7 rows)

Time: 145.620 ms

However, adding foos.id causes the query to choose an extremely slow Seq Scan:

my_db=> EXPLAIN ANALYZE SELECT foos.id, foos.bar_id FROM foos INNER JOIN bar_ids ON foos.bar_id = bar_ids.id;
 QUERY PLAN                                                            
 Hash Join  (cost=27.60..221339.63 rows=353198 width=8) (actual time=294.091..3341.926 rows=196205 loops=1)
   Hash Cond: (foos.bar_id = bar_ids.id)
   ->  Seq Scan on foos  (cost=0.00..182314.70 rows=7093070 width=8) (actual time=0.004..1855.900 rows=7111807 loops=1)
   ->  Hash  (cost=16.71..16.71 rows=871 width=4) (actual time=0.454..0.454 rows=866 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 39kB
         ->  Seq Scan on bar_ids  (cost=0.00..16.71 rows=871 width=4) (actual time=0.002..0.222 rows=871 loops=1)
 Planning time: 0.237 ms
 Execution time: 3371.622 ms
(8 rows)

Time: 3373.150 ms

Disabling Seq Scan forces an Index Scan on the same index, which is an order of magnitude faster than the Seq Scan:

my_db=> set enable_seqscan=false;
SET
Time: 0.801 ms
my_db=> EXPLAIN ANALYZE SELECT foos.id, foos.bar_id FROM foos INNER JOIN bar_ids ON foos.bar_id = bar_ids.id;
 QUERY PLAN                                                                      
 Nested Loop  (cost=10000000000.43..10000439554.99 rows=353198 width=8) (actual time=0.028..171.632 rows=196205 loops=1)
   ->  Seq Scan on bar_ids  (cost=10000000000.00..10000000016.71 rows=871 width=4) (actual time=0.005..0.212 rows=871 loops=1)
   ->  Index Scan using index_foos_on_bar_id on foos  (cost=0.43..500.86 rows=378 width=8) (actual time=0.003..0.118 rows=225 loops=871)
         Index Cond: (bar_id = bar_ids.id)
 Planning time: 0.186 ms
 Execution time: 201.958 ms
(6 rows)

Time: 203.185 ms

Other answers say the poor planning is due to bad statistics. My statistics are up to date. What gives?

bar_ids is a temporary table, which might be related to the insane cost estimates in the last query (Seq Scan on bar_ids (cost=10000000000.00..10000000016.71)), but explicitly running ANALYZE bar_ids doesn't change the query plan. (Note that the 10000000000 component is the artificial penalty the planner attaches to seq-scan nodes while enable_seqscan=off, not a real estimate.)
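
Since temp tables are invisible to autovacuum, a manual ANALYZE is the only way bar_ids gets statistics at all; for reference, that was simply:

my_db=> ANALYZE bar_ids;
ANALYZE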

asked Nov 22 '16 by rcrogers

1 Answer

Following up from the comments on the OP here.

In the first query, where you select only foos.bar_id, the executor can answer the whole query from the index with an index-only scan, which is nifty. Tacking another column onto the select list (one the index doesn't cover), however, means that continuing to use this index requires the typical "double read": first read the index page, then read the table page to fetch the remaining column's value. Across many rows, that implies a potential for quite a bit of random IO.
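
If you want the index-only scan back for the two-column select, one option (a sketch, not something from the question itself: the composite index name is made up, and the column order assumes you still join on bar_id) is an index that covers both columns:

my_db=> CREATE INDEX index_foos_on_bar_id_and_id ON foos (bar_id, id);
CREATE INDEX

With both columns in the index (and the visibility map reasonably fresh), the planner can again satisfy the query without touching the heap; on PostgreSQL 11+ the same effect is available with CREATE INDEX ... INCLUDE (id).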

The setting random_page_cost is interpreted relative to seq_page_cost: loosely, it says random IO is random_page_cost times as expensive as sequential IO (with the default seq_page_cost = 1, and random_page_cost defaulting to 4). With modern drives, random IO isn't that expensive, so lowering random_page_cost can tilt the planner toward index scans. Finding the "best" value is tricky, but starting around 2 is a decent rule of thumb.
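
A low-risk way to experiment (standard PostgreSQL settings; the value 2 is just the starting point suggested above) is to change it for your session only, re-check the plan, and persist it with ALTER SYSTEM only once you're satisfied:

my_db=> SET random_page_cost = 2;
SET
my_db=> EXPLAIN ANALYZE SELECT foos.id, foos.bar_id FROM foos INNER JOIN bar_ids ON foos.bar_id = bar_ids.id;
-- re-check whether the planner now chooses the index scan on its own

-- once satisfied, persist it (requires superuser) and reload the config:
my_db=# ALTER SYSTEM SET random_page_cost = 2;
my_db=# SELECT pg_reload_conf();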

Edit: I didn't have time to add these extra thoughts earlier, but it's been bugging me. If you happen upon this because you're experiencing similar issues, please don't just start fiddling with this setting right out of the gate. It seemed like a fruitful approach in this situation because the OP had already established that the statistics were fresh and reasonable, that the index scan was viable, and that the planner still strongly preferred the table scan even though it was empirically much worse. This setting isn't a silver bullet, and if your performance issues don't really match the ones described here, this solution may not be for you. Other situations may call for query rewrites or for changes to other configuration items, especially those related to memory usage and allocation (a sketch of those follows).
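
As a hedged illustration of those memory-related knobs (real PostgreSQL settings, but the values below are placeholders, not recommendations):

my_db=> SHOW work_mem;              -- per-operation memory for sorts/hashes
my_db=> SHOW effective_cache_size;  -- planner's estimate of cached data

-- work_mem affects whether hashes and sorts spill to disk;
-- effective_cache_size influences how cheap the planner considers
-- index scans to be. Tune both against your actual workload.
my_db=> SET work_mem = '64MB';              -- placeholder value
my_db=> SET effective_cache_size = '4GB';   -- placeholder value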

answered Oct 21 '22 by yieldsfalsehood