Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Disadvantages of creating multiple indexes in a PostgreSQL table

After reading documentation on indexes I thought

Hey, since (in my case) almost always reading from database is performed much more often than writing to it, why not create indexes on most fields in the table?

Is it right attitude? Are there any other downsides of doing this, except longer inserting?

Of course, indexes would be limited to fields that I actually use in conditions of SELECT statements.

like image 338
dev9 Avatar asked Dec 20 '13 15:12

dev9


2 Answers

Indexes have several disadvantages.
First, they consume space. This may be inconsequential, but if your table is particularly large, it may have an impact.
Second, and more importantly, you need to remember that indexes have a performance penalty when it comes to INSERTing new rows, DELETEing old ones or UPDATEing existing values of the indexed column, as now the DML statement need not only modify the table's data, but the index's one too. Once again, this depends largely on your application usecase. If DMLs are so rare that the performance is a non-issue, this may not be a consideration.
Third (although this ties in storngly to my first point), remember that each time you create another database object, you are creating an additional maintenance overhead - it's another index you'd have to occasionally rebuild, collect statistics for (depending on the RDBMS you're using, of course), another objet to clatter the data dictionary, etc.

The bottom line all comes down to your usecase. If you have important queries that you run often and that can be improved by this index - go for it. If you're running this query once in a blue moon, you probably wouldn't want to slow down all your INSERT statements.

like image 168
Mureinik Avatar answered Sep 30 '22 16:09

Mureinik


There also is a (small) performance downside of having many indexes during read operations: the more you have, the more time Postgres spends evaluating which query plan is best.

As you correctly state, there's no point in having an index on fields that you don't use on conditions. It merely slows down inserts.

Note that there also is little point in having an index on fields that offer dubious selectivity. Picture a table that is roughly split in two depending on whether rows have a boolean set to true or false. An index on that field offers little benefit for queries if true and false values distributed mostly evenly — the index could be useful, from lack of better options, if true and false rows are clustered together. If, in contrast, that field is 90% true, a partial index on the 10% that are false is useful.

And lastly, there is the storage issue: each index takes space. Sometimes (GIN in particular) a few times more than the indexed data.

like image 42
Denis de Bernardy Avatar answered Sep 30 '22 17:09

Denis de Bernardy