Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why DB indexes use balanced trees, not hashtables?

Tags:

Hashtables seem to be preferable in terms of disk access. What is the real reason that indexes usually implemented with a tree? Sorry if it's infantile, but i did not find the straight answer on SO.

like image 269
rudnev Avatar asked Oct 28 '09 10:10

rudnev


People also ask

What is the difference in the efficiency of Hashtables and B trees?

The difference between using a b-tree and a hash table is that the former allows you to use column comparisons in expressions that use the =, >, >=, <, <=, or BETWEEN operators, while the latter is used only for equality comparisons that use the = or <=> operators.

Are database indexes hash tables?

Hash indexes are an integral part of databases. If you've ever used a database, chances are that you have seen them in action without even realizing it. Hash indexes differ in work from other types of indexes because they store values rather than pointers to records located on a disk.

Do databases use hash?

Databases (indexing)Probably the most common use for hash tables is databases. To retrieve and increment through data in a database in a performant way, we need some way to identify it uniquely, and we need some way to index it.

How are indexes implemented in DB?

Indexes are created using a few database columns. The first column is the Search key that contains a copy of the primary key or candidate key of the table. These values are stored in sorted order so that the corresponding data can be accessed quickly. Note: The data may or may not be stored in sorted order.


2 Answers

One of the common actions with data is to sort it or to search for data in a range - a tree will contain data in order while a hash table is only useful for looking up a row and has no idea of what the next row is.

So hash tables are no good for this common case, thanks to this answer

SELECT * FROM MyTable WHERE Val BETWEEN 10000 AND 12000 

or

SELECT * FROM MyTable ORDER BY x 

Obviously there are cases where hash tables are better but best to deal with the main cases first.

like image 105
mmmmmm Avatar answered Oct 13 '22 08:10

mmmmmm


Size, btrees start small and perfectly formed and grow nicely to enormous sizes. Hashes have a fixed size which can be too big (10,000 buckets for 1000 entries) or too small (10,000 buckets for 1,000,000,000 entries) for the amount of data you have.

like image 28
James Anderson Avatar answered Oct 13 '22 08:10

James Anderson