I am implementing a table with a column of data type <code>tsvector</code>, and I am trying to understand which index type is the better choice: GIN or GiST. From the PostgreSQL documentation I gather that: <ul> <li>GiST is faster to build and update, but less accurate than GIN.</li> <li>GIN is slower to build and update, but more accurate.</li> </ul> So why would anybody choose a GiST index over GIN if GiST could give you wrong results? There must be some advantage beyond performance. Can anybody explain in layman's terms when I would want to use GIN vs. GiST?

I don't think I could explain it better than the manual already does: <blockquote> In choosing which index type to use, GiST or GIN, consider these performance differences: <ul> <li> GIN index lookups are about three times faster than GiST </li> <li> GIN indexes take about three times longer to build than GiST </li> <li> GIN indexes are moderately slower to update than GiST indexes, but about 10 times slower if fast-update support was disabled [...] </li> <li> GIN indexes are two-to-three times larger than GiST indexes </li> </ul> </blockquote>

Link and quote refer to the manual for Postgres 9.4. Its size and performance estimates were already somewhat outdated at that point. With Postgres 9.4 the odds have shifted substantially in favor of GIN. The release notes of Postgres 9.4 include: <blockquote> <ul> <li> Reduce GIN index size (Alexander Korotkov, Heikki Linnakangas) [...] </li> <li> Improve speed of multi-key GIN lookups (Alexander Korotkov, Heikki Linnakangas) </li> </ul> </blockquote>

Size and performance estimates have since been removed from the manual. Note that there are special use cases that require one or the other.

One thing you misunderstood: You never get wrong results with a GiST index. The index operates on hash values, which can lead to false positives in the index. This should only become relevant with a very large number of distinct words in your documents. In any case, false positives are eliminated by re-checking the actual row. The manual: <blockquote> A GiST index is lossy, meaning that the index may produce false matches, and it is necessary to check the actual table row to eliminate such false matches. <strong>(PostgreSQL does this automatically when needed.)</strong> </blockquote> Bold emphasis mine.
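To make the "lossy but never wrong" point concrete, here is a minimal Python sketch of a signature-style index with a recheck step. It is only an illustration of the idea, not PostgreSQL's implementation: the signature width `SIGNATURE_BITS`, the use of SHA-256 as the hash, and the helper names `word_bit`, `signature`, `build_index` and `search` are all assumptions made up for this example.

```python
import hashlib

# Hypothetical signature width, kept tiny so that hash collisions are likely.
SIGNATURE_BITS = 16


def word_bit(word: str) -> int:
    """Hash a word to a single bit position (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % SIGNATURE_BITS


def signature(words) -> int:
    """OR the bits of all words together into one fixed-length signature."""
    sig = 0
    for word in words:
        sig |= 1 << word_bit(word)
    return sig


def build_index(documents):
    """The 'index' stores only one signature per document, not the words."""
    return [signature(doc) for doc in documents]


def search(documents, index, term):
    """Return (candidates, matches) as lists of document positions."""
    mask = 1 << word_bit(term)
    # Index scan: may include false positives, but never misses a true match.
    candidates = [i for i, sig in enumerate(index) if sig & mask]
    # Recheck against the actual "row" removes the false positives.
    matches = [i for i in candidates if term in documents[i]]
    return candidates, matches


if __name__ == "__main__":
    docs = [{"fat", "cat"}, {"fat", "rat"}, {"dog", "barks"}]
    index = build_index(docs)
    candidates, matches = search(docs, index, "fat")
    print("candidates from index:", candidates)
    print("matches after recheck:", matches)
```

The candidate list can contain documents whose words merely collide with the search term's bit, which is the lossy part. The final result is always exact, because every candidate is compared against the real document before it is returned. The more distinct words a document holds, the more bits are set in its signature and the more often the recheck has to discard a candidate.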

Difference between GiST and GIN index


answered Sep 26 '22 02:09

Erwin Brandstetter


Tags: indexing, full-text-search, postgresql

Walker Farrow
