Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Creating multicolumn index in PostgreSQL, containing both scalar and array columns

In order to optimize complex PostgreSQL query I've tried to create an index containing both scalar strings and arrays and supporting array operations (@>, <@ and &&).

But I only managed to create a BTREE index so far:

CREATE INDEX idx1
  ON "MyTable"
  USING btree
  ("Char_1", "Array_1", "Array_2", "Array_3", "Char_2");

which have no support for array operations (@>, <@ and &&).

I've tried to use GIN and GiST (using btree_gin and btree_gist extensions), but I found no way to use both scalar and array columns in the same index.

It looks like GIN doesn't support scalars:

ERROR:  data type character has no default operator class for access method "gin"
HINT:  You must specify an operator class for the index or define a default operator class for the data type.

while GiST does't support arrays:

ERROR:  data type character varying[] has no default operator class for access method "gist"
HINT:  You must specify an operator class for the index or define a default operator class for the data type.

The only way I found to create such an index is by using to_tsvector function to convert all scalar strings to tsvector datatype. But I don't need full text search here. I even tried to create my own operator class, but quickly realized that it's beyond me.

Is there any way to create multicolumn GIN/GiST index, containing both scalar strings and arrays?

like image 537
Leonid Beschastny Avatar asked Aug 11 '15 15:08

Leonid Beschastny


1 Answers

You need to install the additional module btree_gin or btree_gist respectively, which provide the missing operator classes.

Run once per database:

CREATE EXTENSION btree_gin;  -- or btree_gist

Then you should be able to create your multicolumn index:

CREATE INDEX idx1 ON "MyTable" USING gin
  ("Varchar_1", "Array_1", "Array_2", "Array_3", "Varchar_2");

See:

  • Multicolumn index on 3 fields with heterogenous data types

As for indexes on array types: GIN is the perfect index type for those. The manual:

GIN indexes are inverted indexes which can handle values that contain more than one key, arrays for example.

Bold emphasis mine. The operators @>, <@ and && are defined for various data types. Some of them cooperate with GiST indexes as well. But with arrays as operands, it's typically GIN indexes. See:

  • Use PostgreSQL builtin operator <@ after including extension intarray

The data type character is most probably not what you want. See:

  • Any downsides of using data type “text” for storing strings?
like image 84
Erwin Brandstetter Avatar answered Sep 22 '22 08:09

Erwin Brandstetter