Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Where does a clustered index store row data for keys on intermediate levels?

Please help me in understanding what I feel is inconsistency between two facts:

  1. SQL Server stores data in a B-Tree structure
  2. Only leaf nodes contain actual table data, while intermediate ones store only keys and pointers to children

In general, a B-Tree has the property that, for a given key in the intermediate node, all keys in the left subtree are smaller than it and in the right subtree greater, such that:

Example B-Tree

In the above example (image credit), clearly a row with the ID = 7 was inserted into the table. But where is the row data (non-key columns) for that ID if it can't be in the root node of the example and there is no 7 in the leaf nodes?

Clearly, there's more to it than "indexes are B-Trees" and I would appreciate some insight.

like image 694
Misza Avatar asked Jan 05 '23 11:01

Misza


1 Answers

That diagram is for a B-tree, but technically speaking SQL Server uses a B+tree structure. Scroll down a bit in that Wiki article and you will find

In the B+ tree, copies of the keys are stored in the internal nodes; the keys and records are stored in leaves; in addition, a leaf node may include a pointer to the next leaf node to speed sequential access (Comer 1979, p. 129).

Thus the internal nodes would only have a copy of the keys, and will be duplicated in the leaves (where, in the case of a clustered index, the actual data are held as well).

You can find more specifics here. You'll notice in the comments section a couple other folks noting that SQL Server uses a B+tree.

like image 160
John Wu Avatar answered Jan 14 '23 15:01

John Wu