Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Expression Evaluation and Tree Walking using polymorphism? (ala Steve Yegge)

Polymorphic Tree Walking, Python version

#!/usr/bin/python

class Node:
    """base class, you should not process one of these"""
    def process(self):
        raise('you should not be processing a node')

class BinaryNode(Node):
    """base class for binary nodes"""
    def __init__(self, _left, _right):
        self.left = _left
        self.right = _right
    def process(self):
        raise('you should not be processing a binarynode')

class Plus(BinaryNode):
    def process(self):
        return self.left.process() + self.right.process()

class Minus(BinaryNode):
    def process(self):
        return self.left.process() - self.right.process()

class Mul(BinaryNode):
    def process(self):
        return self.left.process() * self.right.process()

class Div(BinaryNode):
    def process(self):
        return self.left.process() / self.right.process()

class Num(Node):
    def __init__(self, _value):
        self.value = _value
    def process(self):
        return self.value

def demo(n):
    print n.process()

demo(Num(2))                                       # 2
demo(Plus(Num(2),Num(5)))                          # 2 + 3
demo(Plus(Mul(Num(2),Num(3)),Div(Num(10),Num(5)))) # (2 * 3) + (10 / 2)

The tests are just building up the binary trees by using constructors.

program structure:

abstract base class: Node

  • all Nodes inherit from this class

abstract base class: BinaryNode

  • all binary operators inherit from this class
  • process method does the work of evaluting the expression and returning the result

binary operator classes: Plus,Minus,Mul,Div

  • two child nodes, one each for left side and right side subexpressions

number class: Num

  • holds a leaf-node numeric value, e.g. 17 or 42

The problem, I think, is that we need to parse perentheses, and yet they are not a binary operator? Should we take (2) as a single token, that evaluates to 2?

The parens don't need to show up in the expression tree, but they do affect its shape. E.g., the tree for (1+2)+3 is different from 1+(2+3):

    +
   / \
  +   3
 / \
1   2

versus

    +
   / \
  1   +
     / \
    2   3

The parentheses are a "hint" to the parser (e.g., per superjoe30, to "recursively descend")


This gets into parsing/compiler theory, which is kind of a rabbit hole... The Dragon Book is the standard text for compiler construction, and takes this to extremes. In this particular case, you want to construct a context-free grammar for basic arithmetic, then use that grammar to parse out an abstract syntax tree. You can then iterate over the tree, reducing it from the bottom up (it's at this point you'd apply the polymorphism/function pointers/switch statement to reduce the tree).

I've found these notes to be incredibly helpful in compiler and parsing theory.


Representing the Nodes

If we want to include parentheses, we need 5 kinds of nodes:

  • the binary nodes: Add Minus Mul Div
    these have two children, a left and right side

         +
        / \
    node   node
    
  • a node to hold a value: Val
    no children nodes, just a numeric value

  • a node to keep track of the parens: Paren
    a single child node for the subexpression

        ( )
         |
        node
    

For a polymorphic solution, we need to have this kind of class relationship:

  • Node
  • BinaryNode : inherit from Node
  • Plus : inherit from Binary Node
  • Minus : inherit from Binary Node
  • Mul : inherit from Binary Node
  • Div : inherit from Binary Node
  • Value : inherit from Node
  • Paren : inherit from node

There is a virtual function for all nodes called eval(). If you call that function, it will return the value of that subexpression.


String Tokenizer + LL(1) Parser will give you an expression tree... the polymorphism way might involve an abstract Arithmetic class with an "evaluate(a,b)" function, which is overridden for each of the operators involved (Addition, Subtraction etc) to return the appropriate value, and the tree contains Integers and Arithmetic operators, which can be evaluated by a post(?)-order traversal of the tree.