Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Scala: Tree Insert Tail Recursion With Complex Structure

I'm creating a tree of custom objects in scala and my insert method throws a stack overflow because it's not tail recursive. However, I can't quite figure out how to make it tail recursive. Related examples I've seen use "accumulator" variables, but they've either been things like Integers which can just be multiplied and overwritten, or lists which i'm having trouble adapting to a tree. Here's what I have:

The basis for my trees:

abstract class GeoTree
case object EmptyTree extends GeoTree
case class Node(elem:GeoNode, left:GeoTree, right:GeoTree) extends GeoTree

The insert method to recursively create the tree (method causing the stack overflow):

  def insert(t:GeoTree, v: GeoNode): GeoTree = t match {
    case EmptyTree => new Node(v, EmptyTree, EmptyTree)
    case Node(elem:GeoNode, left:GeoTree, right:GeoTree) => {
      if (v < elem) new Node(elem, insert(left, v), right)
      else new Node(elem, left, insert(right, v))
    }
  }

I don't think the code for the GeoNode is actually particularly relevant because it's very simple. The class has two Long attributes and the <, >, and == operators overriden appropriately for use inside a tree. Can someone make a suggestion on how to use an accumulator for my insert function, or some other way to make it tail recursive?

like image 505
The.Anti.9 Avatar asked Jul 07 '13 10:07

The.Anti.9


1 Answers

Your function can't be tail recursive. The reason is that your recursive calls to insert don't end the computation, they're used as a subexpressions, in this case in new Node(...). For example. if you were just searching for the bottom element, it would easy to make it tail recursive.

What's happening: As you're descending the tree down, calling insert on each of the nodes, but you have to remember the way back to the root, since you have to reconstruct the tree after you replace a bottom leaf with your new value.

A possible solution: Remember the down path explicitly, not on stack. Let's use a simplified data structure for the example:

sealed trait Tree;
case object EmptyTree extends Tree;
case class Node(elem: Int, left:Tree, right:Tree) extends Tree;

Now define what a path is: It's a list of nodes together with the information if we went right or left. The root is always at the end of the list, the leaf at the start.

type Path = List[(Node, Boolean)]

Now we can make a tail recursive function that computes a path given a value:

// Find a path down the tree that leads to the leaf where `v` would belong.
private def path(tree: Tree, v: Int): Path = {
  @tailrec
  def loop(t: Tree, p: Path): Path =
    t match {
      case EmptyTree        => p
      case n@Node(w, l, r)  =>
        if (v < w) loop(l, (n, false) :: p)
        else       loop(r, (n, true)  :: p)
    }

  loop(tree, Nil)
}

and a function that takes a path and a value and reconstructs a new tree with the value as a new node at the bottom of the path:

// Given a path reconstruct a new tree with `v` inserted at the bottom
// of the path.
private def rebuild(path: Path, v: Int): Tree = {
  @tailrec
  def loop(p: Path, subtree: Tree): Tree =
    p match {
      case Nil                          => subtree
      case (Node(w, l, r), false) :: q  => loop(q, Node(w, subtree, r))
      case (Node(w, l, r), true)  :: q  => loop(q, Node(w, l, subtree))
    }
  loop(path, Node(v, EmptyTree, EmptyTree))
}

Inserting is then easy:

def insert(tree: Tree, v: Int): Tree =
  rebuild(path(tree, v), v)

Note that this version isn't particularly efficient. Probably you could make it more efficient using Seq, or even further by using a mutable stack to store the path. But with List the idea can be expressed nicely.

Disclaimer: I only compiled the code, I haven't tested it at all.

Notes:

  • This is a special example of using zippers to traverse a functional tree. With zippers you can apply the same idea on any kind of tree and walk the tree in various directions.
  • If you want the tree to be useful for larger sets of elements (which you probably do, if you're getting stack overflows), I'd strongly recommend making it self-balanced. That is, update the structure of the tree so that it's depth is always O(log n). Then your operations will be fast in all cases, you won't have to worry about stack overflows and you can use your stack-based, non-tail-recursive version. Probably the easiest variant of a self-balancing tree is the AA tree.
like image 174
Petr Avatar answered Oct 14 '22 11:10

Petr