Does anyone know a situation where a PostgreSQL HASH should be used instead of a B-TREE for it seems to me that these things are a trap. They are take way more time to CREATE or maintain than a B-TREE (at least 10 times more), they also take more space (for one of my table.columns, a B-TREE takes up 240 MB, while a HASH would take 4 GB) and I seem to have understood from my googling, that they do not SELECT faster than B-TREEs; yet the HASH may have recently been optimized or google was wrong.
Anyway, I wanted you guy's opinions and experiences. If these HASHs are evil, people should know.
Thanks
Also: what about MySQL's HASHs?
Hash Index. Just like the name suggests, Hash indexes in PostgreSQL use a form of the hash table data structure. Hash table is a common data structure in many programming languages. For example, a Dict in Python, a HashMap in Java or the new Map type in JavaScript.
A hash index is an index type that is most commonly used in data management. It is typically created on a column that contains unique values, such as a primary key or email address. The main benefit of using hash indexes is their fast performance.
In this run we have changed hash to btree index in pgbench read-only tests. We can see here that the PostgreSQL hash index performs better than the btree index and the performance difference is in the range of 10% to 22%.
PostgreSQL B-Tree indexes are multi-level tree structures, where each level of the tree can be used as a doubly-linked list of pages. A single metapage is stored in a fixed position at the start of the first segment file of the index. All other pages are either leaf pages or internal pages.
Hashes are faster than B-Trees for cases where you have a known key value, especially a known unique value.
Hashes should be used if the column in question is never intended to be scanned comparatively with <
or >
commands.
Hashes are O(1)
complexity, B-Trees are O(log n)
complexity ( iirc ) , ergo, for large tables with unique entries, fetching an ITEM="foo"
, they will be the most efficient way of looking it up.
This is especially practical when these unique fields are used on a join condition.
It's better to use a Hash index for text columns that are searched using = operator only. For example a URL column which needs to be indexed for lookups.
A Hash index is approximately 30% the size of a B-Tree index for something like a URL.
The reduced size allows PostgreSQL to use it's cache memory (Aka, shared_buffers) more efficiently.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With