Berkeley DB - B-Tree versus Hash Table

Question

I am trying to understand what should drive the choice of access method when using Berkeley DB: B-tree versus hash table. A hash table provides O(1) lookup, but inserts can be expensive (with linear or extensible hashing, inserts are amortized O(1)). B-trees provide O(log_B N) lookup and insert times, where B is the branching factor. A B-tree can also support range queries and allow access in sorted order.

Apart from these considerations, what else should be factored in?
If I don't need to support range queries, can I just use the hash access method?
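The range-query difference mentioned above can be sketched in a few lines. This is only an illustration using plain Python structures as stand-ins (a sorted list for ordered, B-tree-like access and a `dict` for hashed access); it does not use the Berkeley DB API, and the function names are hypothetical.

```python
import bisect

def range_query_sorted(sorted_keys, lo, hi):
    """Return keys in [lo, hi] using binary search on a sorted list.

    Cost is O(log N) to find the start plus the size of the result,
    which is the kind of access an ordered structure makes possible.
    """
    start = bisect.bisect_left(sorted_keys, lo)
    end = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[start:end]

def range_query_hashed(table, lo, hi):
    """Return keys in [lo, hi] from a hash table.

    A hash table keeps no key order, so every key has to be examined
    and the matches sorted afterwards: O(N) regardless of result size.
    """
    return sorted(k for k in table if lo <= k <= hi)

# Illustrative data: multiples of 7 below 1000.
keys = list(range(0, 1000, 7))
sorted_keys = sorted(keys)
table = {k: f"value-{k}" for k in keys}

print(range_query_sorted(sorted_keys, 10, 30))
print(range_query_hashed(table, 10, 30))
```

Both calls return the same keys; the difference is how much of the data set each one has to touch to find them.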

hyc · Accepted Answer

When your data sets get very large, B-trees are still better because the majority of the internal metadata may still fit in cache. Hashes, by their nature (uniform random distribution of data), are inherently cache-unfriendly. That is, once the total size of the data set exceeds the working memory size, hash performance drops off a cliff, while B-tree performance degrades gracefully (logarithmically, actually).
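A rough back-of-the-envelope calculation shows why the internal pages of a B-tree tend to fit in cache even when the data does not. The sketch below assumes a fully packed tree, and the figures of 100 keys per leaf page and a fanout of 100 are illustrative assumptions, not Berkeley DB defaults; `btree_page_counts` is a hypothetical helper.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return (a + b - 1) // b

def btree_page_counts(num_keys, keys_per_leaf, fanout):
    """Return (leaf_pages, internal_pages, height) for a fully packed B-tree.

    keys_per_leaf and fanout are assumed page capacities, chosen for
    illustration only.
    """
    leaf_pages = ceil_div(num_keys, keys_per_leaf)
    internal_pages = 0
    level = leaf_pages
    height = 1
    # Each level above the leaves needs one page per `fanout` children,
    # until a single root page remains.
    while level > 1:
        level = ceil_div(level, fanout)
        internal_pages += level
        height += 1
    return leaf_pages, internal_pages, height

leaves, internal, height = btree_page_counts(100_000_000, 100, 100)
print(leaves, internal, height)
print(f"internal pages are {internal / (leaves + internal):.2%} of all pages")
```

Under these assumptions the internal pages make up about 1% of the tree, so a cache far smaller than the data set can hold all of them, and a lookup then needs roughly one page read for the leaf. A hash lookup lands on an effectively random bucket page, so there is no comparably small set of hot pages to keep cached.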

Tags:

hashtable

b-tree

berkeley-db

rakeshr
