Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can the Postgres array overlap (&&) operator use an index?

Tags:

postgresql

We have a table with an indexed array column:

CREATE TABLE mention (
  id SERIAL,
  phraseIds integer[],
  PRIMARY KEY (id)
);
CREATE INDEX indx_mentions_phraseIds on mention USING GIN (phraseids public.gin__int_ops);

Queries using the "overlaps" operator on this column don't seem to use the index:

explain analyze select m.id FROM mention m WHERE m.phraseIds && ARRAY[11638,11639];

Seq Scan on mention m  (cost=0.00..933723.44 rows=1404 width=4) (actual time=103.018..3751.525 rows=1101 loops=1)
Filter: (phraseids && '{11638,11639}'::integer[])
Rows Removed by Filter: 7019974
Total runtime: 3751.618 ms

Is it possible to get Postgresql to use the index? Or should we be doing something else?

Update: I repeated the test with 'SET enable_seqscan TO off' and the index is still not used.

Update: I should have mentioned that I am using 9.2 with the intarray extension.

Update: It seems that the intarray extension is part of this problem. I re-created the table without using the intarray extension and the index is used as expected. Anyone know how to get the index to be used with the intarray extension? The docs (http://www.postgresql.org/docs/9.2/static/intarray.html) say that indexes are supported for &&.

like image 497
David Tinker Avatar asked Aug 15 '13 17:08

David Tinker


People also ask

Does PostgreSQL support array?

PostgreSQL allows columns of a table to be defined as variable-length multidimensional arrays. Arrays of any built-in or user-defined base type, enum type, composite type, range type, or domain can be created.

What is array length in PostgreSQL?

array_length() is a system function returning the length of the requested array dimension. array_length() was added in PostgreSQL 8.4.

What is Unnest in PostgreSQL?

PostgreSQL UNNEST() function This function is used to expand an array to a set of rows.


1 Answers

I built a similar table in PostgreSQL 9.2; the difference was USING GIN (phraseids); I don't seem to have int_ops available in this context for some reason. I loaded a few thousand rows of random (ish) data.

Setting enable_seqscan off let PostgreSQL use the index.

PostgreSQL calculated the cost of a sequential scan to be less than the cost of a bitmap heap scan. The actual time of a sequential scan was 10% the actual time of a bitmap heap scan, but the total run time for a sequential scan was a little more than the total run time of a bitmap heap scan.

like image 197
Mike Sherrill 'Cat Recall' Avatar answered Sep 20 '22 23:09

Mike Sherrill 'Cat Recall'