Why is a B-Tree the preferred structure for on-disk storage.
What quality makes it preferrable over a binary tree for secondary storage.
Is that specific 'quality' a feature of the alogrithm itself;or the way in which it is implemented?
Any reference or pointer would be much appreciated.
B+ Tree are used to store the large amount of data which can not be stored in the main memory. Due to the fact that, size of main memory is always limited, the internal nodes (keys to access records) of the B+ tree are stored in the main memory whereas, leaf nodes are stored in the secondary memory.
B-Tree structure is designed specifically to avoid disk seeks as much as possible. Therefore B-Tree packs much more keys/pointers into a single node than a binary tree. This property makes the tree very flat. Usually most B-Trees are only 3 or 4 levels deep and the root node can be easily cached.
A B-tree is a tree data structure that keeps data sorted and allows searches, insertions, and deletions in logarithmic amortized time. Unlike self-balancing binary search trees, it is optimized for systems that read and write large blocks of data. It is most commonly used in database and file systems.
Disk seeks are expensive. B-Tree structure is designed specifically to avoid disk seeks as much as possible. Therefore B-Tree packs much more keys/pointers into a single node than a binary tree. This property makes the tree very flat. Usually most B-Trees are only 3 or 4 levels deep and the root node can be easily cached. This requires only 2-3 seeks to find anything in the tree. Leaves are also "packed" this way, so iterating a tree (e.g. full scan or range scan) is very efficient, because you read hundreds/thousands data-rows per single block (seek).
In binary tree of the same capacity, you'd have several tens of levels and sequential visiting every single value would require at least one seek.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With