From what information I could find, GiST and GIN both solve the same problems: more esoteric operations like array containment and intersection (&&, @>, <@, etc.). However, I would be interested in advice about when to use one or the other (or possibly neither).
The PostgreSQL documentation has some information about this.
However, I would be particularly interested to know whether there is a performance impact when the memory-to-index-size ratio gets small (i.e. the index becomes much bigger than the available memory). I've been told on the #postgresql IRC channel that GIN needs to keep the whole index in memory, otherwise it won't be effective, because, unlike B-Tree, it doesn't know which part to read in from disk for a particular query. Is this true? (I've also been told the opposite.) Does GiST have the same restriction? Are there other restrictions I should be aware of when using either of these indexing algorithms?
B-tree is the default index type in Postgres and is best used for specific value searches, range scans, sorting, and some pattern matching (for example, left-anchored patterns such as LIKE 'abc%').
GiST stands for Generalized Search Tree. It is a balanced, tree-structured access method that acts as a base template in which to implement arbitrary indexing schemes.
GIN stands for Generalized Inverted Index. GIN is designed for handling cases where the items to be indexed are composite values, and the queries to be handled by the index need to search for element values that appear within the composite items.
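To make the "inverted" idea concrete, here is a minimal Python sketch of the concept, not of PostgreSQL's actual on-disk structure. It assumes each row holds an array of integers; the function names (`build_inverted_index`, `contains_all`, `overlaps`) are made up for illustration. The index maps each element value to the set of rows that contain it, so a containment query (like `@>`) becomes an intersection of posting sets and an overlap query (like `&&`) becomes a union.

```python
from collections import defaultdict


def build_inverted_index(rows):
    """Map each element value to the set of row ids whose array contains it."""
    index = defaultdict(set)
    for row_id, elements in rows.items():
        for element in elements:
            index[element].add(row_id)
    return index


def contains_all(index, wanted):
    """Row ids whose array contains every value in `wanted` (the idea behind @>)."""
    wanted = list(wanted)
    if not wanted:
        # Simplification for this toy: an empty query matches nothing here.
        return set()
    postings = [index.get(value, set()) for value in wanted]
    return set.intersection(*postings)


def overlaps(index, wanted):
    """Row ids whose array shares at least one value with `wanted` (the idea behind &&)."""
    result = set()
    for value in wanted:
        result |= index.get(value, set())
    return result


# Hypothetical sample data: row id -> array column value.
rows = {1: [1, 2, 3], 2: [2, 3, 4], 3: [5]}
index = build_inverted_index(rows)

print(contains_all(index, [2, 3]))
print(overlaps(index, [1, 5]))
```

Note that a query only touches the posting sets for the values it asks about, which is the general intuition for why an inverted index does not need to be read in full to answer a single query.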
For full-text search, a GiST index is lossy, meaning that the index might produce false matches, and it is necessary to check the actual table row to eliminate such false matches. (PostgreSQL does this automatically when needed.) GiST indexes are lossy in this case because each document is represented in the index by a fixed-length signature.
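The effect of a fixed-length signature can be sketched in a few lines of Python. This is a toy model under stated assumptions, not PostgreSQL's implementation: the 16-bit signature size, the use of `zlib.crc32` as the hash, and the names `signature` and `search` are all choices made for this illustration. Each word sets one bit; a document can only match if its signature has every bit of the query signature set, but different words may share a bit, so candidates have to be rechecked against the real document.

```python
import zlib

SIGNATURE_BITS = 16  # deliberately tiny (an assumption) so collisions are likely


def signature(words):
    """Fold a list of words into a fixed-length bit signature."""
    sig = 0
    for word in words:
        sig |= 1 << (zlib.crc32(word.encode("utf-8")) % SIGNATURE_BITS)
    return sig


def search(documents, signatures, query_words):
    """Return (candidates from signatures alone, matches confirmed by recheck)."""
    query_sig = signature(query_words)
    # Index step: may include false positives, never false negatives.
    candidates = [
        doc_id
        for doc_id, sig in signatures.items()
        if (sig & query_sig) == query_sig
    ]
    # Recheck step: look at the actual document to drop false matches.
    wanted = set(query_words)
    matches = [doc_id for doc_id in candidates if wanted <= set(documents[doc_id])]
    return candidates, matches


# Hypothetical sample documents: id -> list of words.
documents = {1: ["fat", "cat"], 2: ["fat", "rat"], 3: ["dog"]}
signatures = {doc_id: signature(words) for doc_id, words in documents.items()}

candidates, matches = search(documents, signatures, ["fat", "cat"])
print(candidates, matches)
```

The recheck step is the cost of lossiness: the more words are folded into a signature, the more bits are set and the more false candidates have to be fetched from the table and discarded.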
First of all, do you need to use them for text search indexing? GIN and GiST are index types specialized for certain data types. If you need to index simple character or integer values, the normal B-Tree index is the best choice.
Anyway, the PostgreSQL documentation has a chapter on GiST and one on GIN, where you can find more information.
And, last but not least, the best way to find out which is best is to generate sample data (as much as you need to approximate a real scenario) and then create a GiST index, measuring how much time is needed to create the index, insert a new value, and execute a sample query. Then drop the index and do the same with a GIN index. Compare the numbers and you will have the answer you need, based on your data.