Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can a B-tree node be represented?

Tags:

c#

.net

b-tree

We're learning B-trees in class and have been asked to implement them in code. The teacher has left choice of programming language to us and I want to try and do it in C#. My problem is that the following structure is illegal in C#,

unsafe struct BtreeNode
        {
            int key_num;        // The number of keys in a node
            int[] key;          // Array of keys
            bool leaf;          // Is it a leaf node or not?
            BtreeNode*[] c;     // Pointers to next nodes
        }

Specifically, one is not allowed to create a pointer to point to the structure itself. Is there some work-around or alternate approach I could use? I'm fairly certain that there MUST be a way to do this within the managed code, but I can't figure it out.

EDIT: Eric's answer pointed me in the right direction. Here's what I ended up using,

class BtreeNode
{
        public List<BtreeNode> children;       // The child nodes
        public static int MinDeg;               // The Minimum Degree of the tree
        public bool IsLeaf { get; set; }        // Is the current node a leaf or not?
        public List<int> key;                   // The list of keys 
...
}
like image 452
chronodekar Avatar asked Feb 03 '12 17:02

chronodekar


People also ask

What would a node on a tree represent?

Nodes are the points at the ends of branches which represent sequences or hypothetical sequences at various points in evolutionary history.

What is a node in B-tree?

Every node in a B-Tree contains at most m children. Every node in a B-Tree except the root node and the leaf node contain at least m/2 children. The root nodes must have at least 2 nodes. All leaf nodes must be at the same level.

How many nodes can B-tree have?

The final tree can have at most n−1 nodes when n≥2. Unless n=1 there cannot ever be n nodes since we only ever insert a key into a non-empty node, so there will always be at least one node with 2 keys. Next observe that we will never have more than one key in a node which is not a right spine of our B-tree.

How many keys can each node have in a B-tree?

Each node can have at most m−1 keys. be the minimum number of children an internal (non-root) node must have.


1 Answers

Coincidentally I actually just did implement a btree in C#, for a personal project. It was fun. I built a btree of lexicographically ordered variable size (up to 64 byte) keys which presented a number of challenges, particularly around figuring out when a page of storage was too full or too empty.

My advice, having just done that, is to build an abstraction layer that captures just the btree algorithms in their most abstract form, as an abstract base class. Once I got all the btree rules captured in that form, I specialized the base class in several different ways: as a regular fixed-key-size 2-3 btree, as one of my fancy variable-size-key btrees, and so on.

To start with, under no circumstances should you be doing this with pointers. Unsafe code is seldom necessary and never easy. Only the most advanced C# programmers should be turning off the safety system; when you do that, you are taking responsibility for the type and memory safety of the program. If you're not willing to do that, leave the safety system turned on.

Second, there's no reason to make this a struct. Structs are copied by value in C#; a btree node is not a value.

Third, you don't need to keep the number of keys in a node; the array of keys knows how many keys are in it.

Fourth, I would use a List<T> rather than an array; they are more flexible.

Fifth, you need to decide whether the key lives in the node or in the parent. Either way can work; my preference is for the key to live in the node, because I see the key as being associated with the node.

Sixth, it is helpful to know whether a btree node is the root or not; you might consider having two bools, one "is this a leaf?" and one "is this the root?" Of course a btree with a single item in it has a single node that is both leaf and root.

Seventh, you are probably going to build this thing to be mutable; normally one does not make public mutable fields on a C# class. You might consider making them properties. Also, the list of children can be grown and shrunk, but its identity does not change, so make it referentially read-only:

So I would probably structure my basic node as:

class Node
{
    public int Key { get; set; }
    public bool IsRoot { get; set; }
    public bool IsLeaf { get; set; }
    private List<Node> children = new List<Node>();
    public List<Node> Children { get { return this.children; } }
}

Make sense?

like image 79
Eric Lippert Avatar answered Oct 12 '22 06:10

Eric Lippert