Following on from my previous question on this topic, Postgres combining multiple Indexes:
I have the following table on Postgres 9.2 (with postgis):
CREATE TABLE updates (
update_id character varying(50) NOT NULL,
coords geography(Point,4326) NOT NULL,
user_id character varying(50) NOT NULL,
created_at timestamp without time zone NOT NULL
);
And I am running the following query on the table:
select *
from updates
where ST_DWithin(coords, ST_MakePoint(-126.4, 45.32)::geography, 30000)
and user_id='3212312'
order by created_at desc
limit 60
So given that, what Index should I use for (coords + user_id), GIST or BTree?
CREATE INDEX ix_coords_user_id ON updates USING GIST (coords, user_id);
OR
CREATE INDEX ix_coords_user_id ON updates (coords, user_id);
I was reading that BTree performs better than GIST, but am I forced to use GIST since I am using a PostGIS geography field?
A GiST index can be lossy, meaning that the index might produce false matches, and it is necessary to check the actual table row to eliminate such false matches. (PostgreSQL does this automatically when needed.) In full-text search, for example, GiST indexes are lossy because each document is represented in the index by a fixed-length signature.
In Postgres, a B-Tree index is what you most commonly want. Using an index is much faster than a sequential scan because it may only have to read a few pages as opposed to sequentially scanning thousands of them (when you're returning only a few records).
PostgreSQL B-Tree indexes are multi-level tree structures, where each level of the tree can be used as a doubly-linked list of pages. A single metapage is stored in a fixed position at the start of the first segment file of the index. All other pages are either leaf pages or internal pages.
On PostgreSQL's use of B+ trees: normal ("btree" type) indexes in Postgres are not B+ trees. The distinction between B+ trees and B-trees is kind of nonsense for database indexes in the first place -- all the columns in the index itself *are* the lookup key, and they're the same on the leaf level as any other level.
You must use GiST if you want to use any index method other than the regular b-tree indexes (or hash indexes, but they shouldn't really be used). PostGIS indexes require GiST.
B-tree indexes can only be used for basic operations involving equality or ordering, like =, <, <=, >, >=, <>, BETWEEN and IN. While you can create a b-tree index on a geometry object (point, region, etc.), it can only actually be used for equality, as ordering comparisons like > are generally meaningless for such objects. A GiST index is required to support more complex and general comparisons like "contains", "intersects", etc.
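As a rough illustration of that split, here is a minimal sketch against the updates table from the question: a regular b-tree index for the equality test on user_id, and a GiST index for the spatial predicate on coords. The index names are made up for this example, and whether the planner combines the two (for instance with a bitmap AND) depends on your data.

```sql
-- B-tree index (the default method): usable for user_id = '...'
-- Index name is hypothetical.
CREATE INDEX ix_updates_user_id ON updates (user_id);

-- GiST index: usable for spatial predicates such as ST_DWithin on coords
-- Index name is hypothetical.
CREATE INDEX ix_updates_coords ON updates USING GIST (coords);
```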
You can use the btree_gist extension, which provides GiST operator classes with b-tree-equivalent behavior for ordinary data types. It's considerably slower than regular b-tree indexes, but allows you to create a multi-column index that contains both GiST-only types and regular types like text, integer, etc.
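A sketch of what that could look like for the table in the question, assuming the btree_gist contrib module is installed on the server. Without it, the multicolumn GiST index is rejected because character varying has no default GiST operator class. The column order shown is the one from the question; the reverse order is worth testing too.

```sql
-- Requires the btree_gist contrib module to be available on the server
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Multicolumn GiST index mixing a geography column and a varchar column
CREATE INDEX ix_coords_user_id ON updates USING GIST (coords, user_id);
```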
In these situations you really need to use explain analyze (explain.depesz.com is useful for this) to examine how Pg uses various indexes and combinations of indexes that you create. Try different column orderings in multi-column indexes, and see whether two or more separate indexes are more effective.
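For example, something like the following, run once per candidate index setup, shows which indexes the planner actually picks for the query from the question and how long each plan node takes. Note that EXPLAIN ANALYZE really executes the query.

```sql
-- Run after creating (or dropping) each candidate index and compare plans
EXPLAIN ANALYZE
SELECT *
FROM updates
WHERE ST_DWithin(coords, ST_MakePoint(-126.4, 45.32)::geography, 30000)
  AND user_id = '3212312'
ORDER BY created_at DESC
LIMIT 60;
```

The text output can be pasted into explain.depesz.com to make the slow nodes easier to spot.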
I strongly suspect that you'll get the best results with the multicolumn GiST index in this case, but I'd try several different combinations of indexes and index column orderings to see.