Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Where the B+ tree indexes are stored exaclty?

I am watching this video on B+ tree basics, he mentioned that the B+ tree leaves are stored on disk except the root which is stored in main memory. My professor in class mentioned that the indexes are stored in main memory, and leaves contain data pointers which point to disk.

See the below diagram:

enter image description here

My question is where exactly all the indexes are stored?

  • Do we store just the root in main memory?
  • All the indexes are stored in main memory?

I have gone through couple of links which nobody has clearly mentioned this part? Could anyone clarify my issue. Thank you

like image 242
python Avatar asked Mar 13 '23 22:03

python


2 Answers

What you have been told is garbage in many respects.

For B-tree indexes to work properly there is no predesigned scheme where what needs to be stored. An RDBMS can freely decide where to store what kinds of data.

Most RDBMSes that I know of treat all B-tree pages and all non-tree data pages uniformly. There is an in-memory cache for fast access but all data is stored on disk, too. There is no reason to make a distinction between trees and non-trees. Also, there is no distinction between root, inner and leaf pages when it comes to storing that data.

But this does not have to be so and different choices can be made.

It certainly is false that the tree root would be treated specially. It needs to be just as persistent as the index contents so it must exist on disk. Storing it just in memory would loose the index in case the database shuts down.

Some database systems support in-memory indexes that are rebuilt from persistent data when the database starts up. That is a valid design choice as well. But this is a choice, it does not have to be so.

like image 108
usr Avatar answered Mar 24 '23 15:03

usr


This depends on the detailed implementation of the database server. With SQL Server (which you've tagged this as being the server) the root may always be in memory but it will also be on disc - it has to be as SQL Server can be stopped and restarted - and the actual machine can be rebooted - without loosing the root page.

I don't believe that MS make any promises about how this sort of stuff is implemented. The scheme presented sounds sensible but cannot be guaranteed to be what happens. For instance, I would imagine the root page would NOT be in memory until the index is actually used (why waste resources which haven't been asked for yet?); once it has been used it may be held in memory, but if not used for a time if the server starts running out of memory it will probably be flushed to disc. In any case, any changes to the root page will be flushed to disc as part of the transaction which required the change.

like image 33
simon at rcl Avatar answered Mar 24 '23 16:03

simon at rcl