Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

recursive datatypes in haskell

Consider the following definition taken from a tutorial at http://www.haskell.org :

data Tree a                = Leaf a | Branch (Tree a) (Tree a) 
fringe                     :: Tree a -> [a]
fringe (Leaf x)            =  [x]
fringe (Branch left right) =  fringe left ++ fringe right

I am unclear about what happens at runtime when the function fringe is executing.

When compiling the expression fringe left , (1) does the compiler somehow already know if the left tree is a Branch or a Leaf - i.e. it only operates on statically known trees - or (2) does it emit some if/switch like conditions to check if the left tree is a Leaf or a Branch

If it is the later i.e. (2), then, why is this supposed to be more typesafe than the equivalent C function which basically would look just like above except that there is only one type floating around (pointer to a node).

like image 291
MK. Avatar asked Apr 10 '17 01:04

MK.


1 Answers

This:

fringe (Leaf x)            =  [x]
fringe (Branch left right) =  fringe left ++ fringe right

is exactly equivalent to a function of a single variable that then immediately does a pattern match:

fringe t = case t of
    Leaf x            -> [x]
    Branch left right -> fringe left ++ fringe right

So this answers your first question as (2): "it emit[s] some case-like conditions to check if the left tree is a Leaf or a Branch".

As for why it's safer than what you would do in C, well, what would you do in C?

Usually you end up storing a product of a tag which shows if something is Leaf or Branch, and a payload which is the untagged union of a and (Tree a, Tree a). Then, the code you write is along the lines of the following (which is probably not 100% legal C but should get the point across):

enum TreeTag { Tree_Leaf; Tree_Branch; };

struct Tree {
  TreeTag tag;
  union payload {
    struct { 
      int x;    // yeah let's not touch on parametric polymorphism here...
    } Leaf;
    struct {
      Tree l;
      Tree r;
    } Branch;
  };
};


switch (t.tag) {
  case Tree_Leaf: ... use t.payload.Leaf.x here
  case Tree_Branch: ... use t.payload.Branch.left and t.payload.Branch.right here
}

The problem is that there is nothing statically preventing you from accidentally using t.payload.Branch.left in the Tree_Leaf case and so on. Also, there's nothing statically preventing you to do something like

t.tag = Tree_Branch;
t.payload.Leaf.x = 42;

which would lead to an invalid value of "type" Tree.

like image 116
Cactus Avatar answered Oct 20 '22 10:10

Cactus