Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to implement B+ tree in Haskell?

Tags:

haskell

b-tree

A B+ tree has the leaf nodes linked together. Viewing the pointer structure of a B+ tree as directed graph its not cyclic. But ignoring the directions of pointers and viewing it as undirected the leaf nodes linked together creates cycles in the graph.

In Haskell how could a leaf be constructed as the child of a parent internal node and simultaneously the next link from the adjacent leaf node. How could one do this with Haskell's algebraic datatypes? It seems that Haskell ADT in general make cyclic like structures difficult to express.

like image 940
user782220 Avatar asked Dec 01 '13 06:12

user782220


1 Answers

Here is an idea for (immutable / "mutable"-by-reconstruction / zipperable) ADT representation (involving immutable vectors):

module Data.BTree.Internal where

import Data.Vector

type Values v = Vector v

type Keys k = Vector k

data Leaf k v
  = Leaf
    { _leafKeys   :: !(Keys k)
    , _leafValues :: !(Values v)
    , _leafNext   :: !(Maybe (Leaf k v)) -- @Maybe@ is lazy in @Just@, so this strict mark
                                         -- is ok for tying-the-knot stuff.
    -- , _leafPrev   :: !(Maybe (Leaf k v))
    -- ^ for doubly-linked lists of leaves
    }

type Childs k v = Vector (BTree k v)

data Node k v
  = Node
    { _nodeKeys   :: !(Keys k)
    , _nodeChilds :: !(Childs k v)
    }

data BTree k v
  = BTreeNode !(Node k v)
  | BTreeLeaf !(Leaf k v)

newtype BTreeRoot k v
  = BTreeRoot (BTree k v)
  • This should be internal, so that improper usage of raw constructors, accessors or pattern-matching wouldn't break the tree.

  • Keys, Values, Childs length control can be added (with run-time checks or possibly with GADTs and such).

And for an interface:

module Data.BTree ( {- appropriate exports -} ) where

import Data.Vector
import Data.BTree.Internal

-- * Building trees: "good" constructors.

keys :: [k] -> Keys k
keys = fromList

values :: [v] -> Values v
values = fromList

leaves :: [Leaf k v] -> Childs k v
leaves = fromList . fmap BTreeLeaf

leaf :: Keys k -> Values v -> Maybe (Leaf k v) -> Leaf k v
-- or
-- leaf :: Keys k -> Values v -> Maybe (Leaf k v) -> Maybe (Leaf k v) -> Leaf k v
-- for doubly-linked lists of leaves
leaf = Leaf

node :: Keys k -> Childs k v -> BTree k v
node ks = BTreeNode . Node ks

-- ...

-- * "Good" accessors.

-- ...

-- * Basic functions: insert, lookup, etc.

-- ...

Then this kind of a tree:

B+tree example

can be built as

test :: BTree Int ByteString
test = let
  root  = node (keys [3, 5]) (leaves [leaf1, leaf2, leaf3])
  leaf1 = leaf (keys [1, 2]) (values ["d1", "d2"]) (Just leaf2)
  leaf2 = leaf (keys [3, 4]) (values ["d3", "d4"]) (Just leaf3)
  leaf3 = leaf (keys [5, 6, 7]) (values ["d5", "d6", "d7"]) Nothing
  in root

This technique known as "tying the knot". Leaves can be cycled:

  leaf1 = leaf (keys [1, 2]) (values ["d1", "d2"]) (Just leaf2)
  leaf2 = leaf (keys [3, 4]) (values ["d3", "d4"]) (Just leaf3)
  leaf3 = leaf (keys [5, 6, 7]) (values ["d5", "d6", "d7"]) (Just leaf1)

or doubly-linked (assuming _leafPrev and corresponding leaf function):

  leaf1 = leaf (keys [1, 2]) (values ["d1", "d2"]) (Just leaf2) (Just leaf3)
  leaf2 = leaf (keys [3, 4]) (values ["d3", "d4"]) (Just leaf3) (Just leaf1)
  leaf3 = leaf (keys [5, 6, 7]) (values ["d5", "d6", "d7"]) (Just leaf1) (Just leaf2)

Fully mutable representation is also possible with mutable vectors and mutable references:

type Values v = IOVector v

type Keys k = IOVector k

type Childs k v = IOVector (BTree k v)

    , _leafNext   :: !(IORef (Maybe (Leaf k v)))

and so on, basically the same, but using IORef and IOVector, working in IO monad.

like image 82
JJJ Avatar answered Oct 21 '22 04:10

JJJ