Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multicolumn index on 3 fields with heterogenous data types

I have a postgres table with 3 fields:

  • a : postgis geometry
  • b : array varchar[]
  • c : integer

and I have a query that involves all of them. I would like to add a multicolumn index to speed it up but I cannot as the 3 fields cannot go under the same index because of their nature.

What is the strategy in this case? Adding 3 indexes gist, gin and btree and postgres will use them all during the query?

like image 597
nourdine Avatar asked Mar 26 '14 10:03

nourdine


People also ask

Which is the index used for multiple fields?

A multicolumn index is an index based on the values in multiple columns of a table.

Can we create index on multiple columns?

If you specify the columns in the right order in the index definition, a single composite index can speed up several kinds of queries on the same table. A multiple-column index can be considered a sorted array, the rows of which contain values that are created by concatenating the values of the indexed columns.

What is a multicolumn index?

Multicolumn indexes are indexes that store data on up to 32 columns. When creating a multicolumn index, the column order is very important. This is due to the structure that multicolumn indexes possess. Multicolumn indexes are structured to have a hierarchical structure.

When we combine multiple columns in a single index it is known as a index?

A concatenated index, also known as multi-column, composite or combined index, is one index across multiple columns.


2 Answers

Single-column index

Postgres can combine multiple indexes very efficiently in a single query with bitmap index scans. Most of the time, the most selective index is picked (or two, combined with bitmap index scans) and the rest is filtered. Once the result set is narrow enough, it's not efficient to scan more indexes.

Multicolumn index

It is still faster to have a perfectly matching multicolumn index, but not by orders of magnitude.
Since you want to include an array type I suggest to use a GIN index. AFAIK, operator classes are missing for general-purpose GiST indexes on array type. (The exception being intarray for integer arrays.)

To include the integer column, first install the additional module btree_gin, which provides the necessary GIN operator classes. Run once per database:

CREATE EXTENSION btree_gin;

Then you should be able to create your multicolumn index:

CREATE INDEX tbl_abc_gin_idx ON tbl USING GIN(a, b, c);

The order of index columns is irrelevant for GIN indexes. The manual:

A multicolumn GIN index can be used with query conditions that involve any subset of the index's columns. Unlike B-tree or GiST, index search effectiveness is the same regardless of which index column(s) the query conditions use.

Nearest neighbour search

Since you are including a PostGis geometry type, chances are you want to do a nearest neighbour search, for which you need a GiST index. In this case I suggest two indexes:

CREATE INDEX tbl_ac_gist_idx ON tbl USING GiST(a, c);  -- geometry type
CREATE INDEX tbl_bc_gin_idx  ON tbl USING GIN(b, c);

You could add the integer column c to either or both. It depends. For that, you need either btree_gin or btree_gist or both, respectively.

like image 166
Erwin Brandstetter Avatar answered Oct 04 '22 02:10

Erwin Brandstetter


the 3 fields cannot go under the same index because of their nature

Oh yes they can.

like image 43
Richard Huxton Avatar answered Oct 04 '22 02:10

Richard Huxton