Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why postgresql "disrespects" hash indices?

Tags:

I know postgresql discourages using hash indices. They actually say:

"Caution Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. They are also not replicated over streaming or file-based replication. For these reasons, hash index use is presently discouraged."

This is the good argument not to use them at all, but I can't understand why postgresql developers don't do effort to make hash indices first class citizens and to encourage their usage in certain situations rather than discourage to do it at all.

Actually if you only need to search for equality, hash indices should be far superior than any kind of trees, since they do search, insertion and deletion in o(1), and balanced trees naturally can't be better than o(log(n)). In worst case hash indices could work for o(n), but there is a bunch of known techniques to avoid worst case. If I were a db engine architect, such an argument should definitely rule my decision to make hash indices a viable alternative, but with postgresql it seems different. Is there a technical reason for this, or such decision is not technically motivated?

like image 452
Kivan Avatar asked Jul 03 '16 08:07

Kivan


1 Answers

Tree indexes, by using for instance B+-trees and their variants, are so efficient that they are considered having costs of O(c), where c, the height of the tree, is a small constant (with c = 3 or 4 you can index millions of records), and usually at least one or two levels of such trees are cached, so that the number of disk accesses can be equal to 1 or 2 in most cases.

So, for practical purposes, they have performances similar to those of hash indexes, and, moreover, have the enormous advantage of allowing range searches.

like image 50
Renzo Avatar answered Sep 28 '22 03:09

Renzo