I'm trying to optimize some of my SELECT queries using EXPLAIN ANALYZE, and I can't understand why PostgreSQL uses a sequential scan instead of an index scan:
explain analyze SELECT SUM(a.deure)-SUM(a.haver) as Value FROM assentaments a
LEFT JOIN comptes c ON a.compte_id = c.id WHERE c.empresa_id=2 AND c.nivell=11 AND
(a.data >='2007-01-01' AND a.data <='2007-01-31') AND c.codi_compte LIKE '6%';
------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=44250.26..44250.27 rows=1 width=12)
(actual time=334.054..334.054 rows=1 loops=1)
-> Nested Loop (cost=0.00..44249.20 rows=211 width=12)
(actual time=65.277..333.179 rows=713 loops=1)
-> Seq Scan on comptes c (cost=0.00..8001.72 rows=118 width=4)
(actual time=0.053..64.287 rows=236 loops=1)
Filter: (((codi_compte)::text ~~ '6%'::text) AND
(empresa_id = 2) AND (nivell = 11))
-> Index Scan using index_compte_id on assentaments a
(cost=0.00..307.16 rows=2 width=16) (actual time=0.457..1.138 rows=3 loops=236)
Index Cond: (a.compte_id = c.id)
Filter: ((a.data >= '2007-01-01'::date) AND (a.data <= '2007-01-31'::date))
Total runtime: 334.104 ms
(8 rows)
I've created a custom index:
CREATE INDEX "index_multiple" ON "public"."comptes" USING btree(codi_compte ASC NULLS LAST,
empresa_id ASC NULLS LAST, nivell ASC NULLS LAST);
I've also created separate indexes on these three columns of the comptes table, just to check whether the planner would switch to an index scan, but it doesn't; the result is the same:
CREATE INDEX "index_codi_compte" ON "public"."comptes" USING btree(codi_compte ASC NULLS LAST);
CREATE INDEX "index_comptes" ON "public"."comptes" USING btree(codi_compte ASC NULLS LAST);
CREATE INDEX "index_multiple" ON "public"."comptes" USING btree(codi_compte ASC NULLS LAST, empresa_id ASC NULLS LAST, nivell ASC NULLS LAST);
CREATE INDEX "index_nivell" ON "public"."comptes" USING btree(nivell ASC NULLS LAST);
thanks!
m.
assentaments.id and assentaments.data also have their own indexes.
select count(*) FROM comptes => 148498
select count(*) from assentaments => 2128771
select count(distinct(codi_compte)) FROM comptes => 137008
select count(distinct(codi_compte)) FROM comptes WHERE codi_compte LIKE '6%' => 368
select count(distinct(codi_compte)) FROM comptes WHERE codi_compte LIKE '6%' AND empresa_id=2; => 303
There are a few (normally good) reasons for Postgres to choose a sequential scan even when it could use an index scan: the table is small, a large proportion of the rows is being returned, or there is a LIMIT clause and the planner thinks it can abort early.
If you need only a single table row, an index scan is much faster than a sequential scan. If you need the whole table, a sequential scan is faster than an index scan.
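One way to see which of these reasons applies is to discourage sequential scans for the current session and compare the resulting plan with the original one. The following is a diagnostic sketch only, reusing the query from the question; enable_seqscan is a planner setting meant for experiments like this, not something to leave off permanently.

```sql
-- Diagnostic only: make sequential scans look very expensive to the planner
-- for this session, then see whether an index-based plan becomes possible.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT SUM(a.deure) - SUM(a.haver) AS value
FROM assentaments a
LEFT JOIN comptes c ON a.compte_id = c.id
WHERE c.empresa_id = 2
  AND c.nivell = 11
  AND a.data >= '2007-01-01' AND a.data <= '2007-01-31'
  AND c.codi_compte LIKE '6%';

-- Restore the default planner behaviour.
RESET enable_seqscan;
```

If the plan switches to an index but runs slower, the planner's original choice was reasonable. If it still shows a Seq Scan on comptes, the planner considers none of the existing indexes usable for that filter at all.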
A full table scan (also known as a sequential scan) is a scan made on a database where each row of the table is read in a sequential (serial) order and the columns encountered are checked for the validity of a condition.
An index scan is usually faster than a table scan because the index holds sorted data, so the executor knows when to stop and move on to another range. An index seek (SQL Server's term for a lookup on a specific key or key range) is the fastest way to retrieve data, and it comes into play when the search criterion is very specific.
If you want an index on a TEXT column to be usable for LIKE queries, and your database does not use the C locale, you need to create it with text_pattern_ops, like this:
test=> CREATE TABLE t AS SELECT n::TEXT FROM generate_series( 1,100000 ) n;
test=> CREATE INDEX tn ON t(n);
test=> VACUUM ANALYZE t;
test=> EXPLAIN ANALYZE SELECT * FROM t WHERE n LIKE '123%';
QUERY PLAN
--------------------------------------------------------------------------------------------------
Seq Scan on t (cost=0.00..1693.00 rows=10 width=5) (actual time=0.027..14.631 rows=111 loops=1)
Filter: (n ~~ '123%'::text)
Total runtime: 14.664 ms
test=> CREATE INDEX tn2 ON t(n text_pattern_ops);
CREATE INDEX
Time: 267.589 ms
test=> EXPLAIN ANALYZE SELECT * FROM t WHERE n LIKE '123%';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=5.25..244.79 rows=10 width=5) (actual time=0.089..0.121 rows=111 loops=1)
Filter: (n ~~ '123%'::text)
-> Bitmap Index Scan on tn2 (cost=0.00..5.25 rows=99 width=0) (actual time=0.077..0.077 rows=111 loops=1)
Index Cond: ((n ~>=~ '123'::text) AND (n ~<~ '124'::text))
Total runtime: 0.158 ms
See the details here:
http://www.postgresql.org/docs/9.1/static/indexes-opclass.html
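Applied to the table from the question, this could look like the sketch below. It assumes codi_compte is a varchar column (the plan in the question casts it to text), and the index name is hypothetical.

```sql
-- Check the database collation; with "C" (or "POSIX") a plain btree index
-- already supports LIKE 'prefix%' and no extra index is needed.
SELECT datcollate FROM pg_database WHERE datname = current_database();

-- Assumption: codi_compte is varchar. Use text_pattern_ops instead
-- if the column is declared as text. The index name is hypothetical.
CREATE INDEX index_codi_compte_pattern
    ON comptes (codi_compte varchar_pattern_ops);

-- Refresh statistics, then re-check the plan for the filter on comptes.
ANALYZE comptes;

EXPLAIN ANALYZE
SELECT id FROM comptes
WHERE empresa_id = 2 AND nivell = 11 AND codi_compte LIKE '6%';
```

Whether the planner actually picks the new index still depends on its cost estimates, so the EXPLAIN ANALYZE output is the thing to check.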
If you do not want to create an additional index, and the column is TEXT, you can replace "codi_compte LIKE '6%'" with "codi_compte >= '6' AND codi_compte < '7'", which is a simple index range condition.
test=> EXPLAIN ANALYZE SELECT * FROM t WHERE n >= '123' AND n < '124';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Index Scan using tn on t (cost=0.00..126.74 rows=99 width=5) (actual time=0.030..0.127 rows=111 loops=1)
Index Cond: ((n >= '123'::text) AND (n < '124'::text))
Total runtime: 0.153 ms
In your case this solution is probably better.
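For the query from the question, the range rewrite might look like the sketch below. It assumes that every code starting with '6' sorts between '6' and '7' under the database collation, which is worth verifying by comparing row counts against the LIKE version.

```sql
EXPLAIN ANALYZE
SELECT SUM(a.deure) - SUM(a.haver) AS value
FROM assentaments a
LEFT JOIN comptes c ON a.compte_id = c.id
WHERE c.empresa_id = 2
  AND c.nivell = 11
  AND a.data >= '2007-01-01' AND a.data <= '2007-01-31'
  -- Range form of codi_compte LIKE '6%'; a plain btree index
  -- on codi_compte can serve this condition.
  AND c.codi_compte >= '6' AND c.codi_compte < '7';

-- Sanity check: the two counts should be equal if the rewrite is equivalent.
SELECT
  (SELECT count(*) FROM comptes
    WHERE codi_compte LIKE '6%') AS like_count,
  (SELECT count(*) FROM comptes
    WHERE codi_compte >= '6' AND codi_compte < '7') AS range_count;
```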