Advantage of B+ trees over BSTs?

I'm learning about B+ trees in a class about databases and I was wondering what concrete advantages B+ trees would give over Binary Search Trees?

It seems like they both have O(log N) average complexity for most operations of note, but B+ trees also incur an additional (negligible?) search within each node, whereas a BST obviously takes only O(1) time to figure out which child node to advance to.

What real-world advantages make B+ trees more popular in databases than BSTs?

asked Mar 18 '13 19:03

riggspc

1 Answer

The major advantage of the B+ tree (and B-trees in general) over binary search trees is that they play well with caches. If you have a binary search tree whose nodes are stored in more or less random order in memory, then each time you follow a pointer, the machine will likely have to pull a new block of memory into the processor cache, which is dramatically slower than accessing memory that is already in cache.

The B+-tree and the B-tree work by having each node store a large number of keys or values and have a large number of children. The keys in a node are typically packed together so that a single node fits nicely into cache (or, if stored on disk, can be pulled from the disk in a single read operation). You then have to do more work to find a key within the node or determine which child to read next, but because all memory accesses done on a single node can be done without going back to disk, the access times are very small. This means that even though in principle a BST might be better in terms of number of memory accesses, the B+-tree and the B-tree can perform better in terms of the runtime of those memory accesses.
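To make the "more work inside a node, fewer nodes touched" trade-off concrete, here is a minimal sketch of a B+-tree lookup in Python. The `Node` class and `search` function are names made up for illustration, not any real database's implementation, and the convention that child `i + 1` holds keys greater than or equal to separator `i` is an assumption. The sketch counts how many nodes are visited, since each visit would be one cache-line fetch or disk read.

```python
from bisect import bisect_right


class Node:
    """Hypothetical B+-tree node: internal nodes have children, leaves have values."""

    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # sorted keys (separators in internal nodes)
        self.children = children  # list of child nodes, or None for a leaf
        self.values = values      # list of values, only used in leaves


def search(root, key):
    """Return (value or None, number of nodes visited)."""
    node = root
    visited = 1
    while node.children is not None:
        # Binary search inside the node: extra CPU work, but no new node fetch.
        i = bisect_right(node.keys, key)
        node = node.children[i]   # following a pointer: the expensive step
        visited += 1
    # In the leaf, find the rightmost key <= the search key and compare.
    i = bisect_right(node.keys, key) - 1
    if i >= 0 and node.keys[i] == key:
        return node.values[i], visited
    return None, visited


# A tiny two-level tree holding the keys 1..9.
leaves = [
    Node([1, 2, 3], values=["a", "b", "c"]),
    Node([4, 5, 6], values=["d", "e", "f"]),
    Node([7, 8, 9], values=["g", "h", "i"]),
]
root = Node([4, 7], children=leaves)

print(search(root, 5))   # ('e', 2)
print(search(root, 10))  # (None, 2)
```

Every lookup here touches two nodes. A balanced BST over the same nine keys would follow up to four pointers, and the gap widens quickly as the number of keys per node grows.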

The typical use case for a B+-tree or B-tree is in a database, where there is so much data that it can't all fit into main memory. Accordingly, the data can be stored in a B+-tree or B-tree on a hard disk, which minimizes the number of disk reads necessary to pull in the data during lookups. Some filesystems (like ext4, I believe) use B-trees as well for the same reason - they minimize the number of disk lookups necessary, which is the real bottleneck.
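For a rough sense of scale, the snippet below estimates how many levels (and so, in the worst case, how many node reads per lookup) a tree needs for a given number of keys. The fanout of 256 and the one billion keys are assumed numbers picked for illustration, the BST is assumed to be perfectly balanced (its best case), and `levels_needed` is a hypothetical helper.

```python
def levels_needed(n_keys, fanout):
    """Smallest number of levels h such that fanout**h >= n_keys."""
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout  # each extra level multiplies reachable keys by the fanout
        levels += 1
    return levels


n = 1_000_000_000  # assumed table size: one billion keys

print(levels_needed(n, 2))    # 30: a perfectly balanced binary tree
print(levels_needed(n, 256))  # 4: a tree with an assumed 256 children per node
```

If each level costs one disk read, that is roughly 30 reads versus 4 for a single lookup, which is the difference that matters when the disk is the bottleneck.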

Hope this helps!

answered Oct 18 '22 18:10

templatetypedef
