Postgres hstore: GIN vs GiST index performance

I have to decide whether to use GIN or GiST indexing for an hstore column.

The Postgres docs state:

  • GIN index lookups are about three times faster than GiST
  • GIN indexes take about three times longer to build than GiST
  • GIN indexes are about ten times slower to update than GiST
  • GIN indexes are two-to-three times larger than GiST

The way I interpret it: use GIN if you need to query a lot, and GiST if you need to update a lot.
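For reference, the two candidate indexes differ only in the access method named in `CREATE INDEX`. The snippet below is a minimal sketch that builds those statements, plus a query for the resulting index size, as plain strings. The table name `items` and column name `attrs` are hypothetical, and the statements assume the hstore extension is already installed.

```python
# Sketch: SQL for comparing a GIN and a GiST index on an hstore column.
# "items" and "attrs" are hypothetical names, not taken from the question.


def create_index_sql(table: str, column: str, method: str) -> str:
    """Return a CREATE INDEX statement using the given access method."""
    if method not in ("gin", "gist"):
        raise ValueError("method must be 'gin' or 'gist'")
    return (
        f"CREATE INDEX {table}_{column}_{method} "
        f"ON {table} USING {method} ({column});"
    )


def index_size_sql(table: str, column: str, method: str) -> str:
    """Return a query reporting the on-disk size of the index named above."""
    return (
        "SELECT pg_size_pretty("
        f"pg_relation_size('{table}_{column}_{method}'));"
    )


if __name__ == "__main__":
    for method in ("gin", "gist"):
        print(create_index_sql("items", "attrs", method))
        print(index_size_sql("items", "attrs", method))
```

Running the printed statements in `psql` with `\timing` switched on gives a rough build time and size for each index on your own data, which is what the build-time and size claims in the list above refer to.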

In this test, all three of the disadvantages of GIN relative to GiST listed above are confirmed. However, contrary to what the Postgres docs suggest, the advantage of GIN over GiST (faster lookups) is very small. Slide 53 shows that in the test GIN was only 2% to 3% faster, as opposed to the 200% to 300% suggested by the Postgres docs.

Which source of information is more reliable and why?

migu asked Dec 06 '13 05:12

1 Answer

The documentation states what the situation is "in general".

However, you aren't running PostgreSQL "in general"; you are running it on specific hardware with a specific pattern of use.

So, if you care a lot, you'll want to test it yourself. A GiST index will always require re-checking its condition: each candidate row it returns has to be checked again against the actual data. However, if the queries you run end up doing further checks anyway, a GIN index might not win there. There are also all the usual issues around cache usage, etc.
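One way to run such a test is to time the same query repeatedly, once with only the GIN index in place and once with only the GiST index. The helper below is a rough sketch, not a rigorous benchmark: it takes any callable that executes a SQL string, so it does not depend on a particular driver. The query, table name `items`, and column name `attrs` are hypothetical.

```python
import statistics
import time
from typing import Callable, List


def median_latency_ms(
    execute: Callable[[str], object], sql: str, repeats: int = 20
) -> float:
    """Run `sql` through `execute` `repeats` times; return the median time in ms."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        execute(sql)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Hypothetical query: rows whose hstore column contains the pair color=>red.
QUERY = "SELECT count(*) FROM items WHERE attrs @> 'color=>red';"
```

With a DB-API cursor `cur` (for example from psycopg2, assumed here), `execute` could be `lambda q: (cur.execute(q), cur.fetchall())`. Using the median damps the effect of cold caches on the first few runs. Prefixing the query with `EXPLAIN (ANALYZE, BUFFERS)` should confirm whether the index is actually used and how much re-checking takes place.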

For my usage, on smaller databases with moderate update rates, I've been happy enough with GiST. I've seen a 50% improvement in speed with GIN (across a whole query), but it's not been worth the slower indexing. If I were building a huge archive server, it might be different.

Richard Huxton answered Sep 30 '22 13:09