Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to walk binary abstract syntax tree to generate infix notation with minimally correct parentheses

I am being passed a binary AST representing a math formula. Each internal node is an operator and leaf nodes are the operands. I need to walk the tree and output the formula in infix notation. This is pretty easy to do by walking the tree with a recursive algorithm such as the Print() method shows below. The problem with the Print() method is that the order of operations is lost when converting to infix because no parentheses are generated.

I wrote the PrintWithParens() method which outputs a correct infix formula, however it adds extraneous parentheses. You can see in three of the four cases of my main method it adds parenthesis when none are necessary.

I have been racking my brain trying to figure out what the correct algorithm for PrintWithMinimalParens() should be. I'm sure there must be an algorithm that can output only parentheses when necessary to group terms, however I have been unable to implement it correctly. I think I must need to look at the precedence of the operators in the tree below the current node, but the algorithm I have there now does't work (see the last 2 cases in my main method. No parentheses are needed, but my logic adds them).

public class Test {

static abstract class Node {

    Node left;
    Node right;
    String text;

    abstract void Print();
    abstract void PrintWithParens();
    abstract void PrintWithMinimalParens();

    int precedence()
    {
        return 0;
    }
}

enum Operator { 
    PLUS(1,"+"), 
    MINUS(1, "-"), 
    MULTIPLY(2, "*"), 
    DIVIDE(2, "/"), 
    POW(3, "^") 
    ;

    private final int precedence;
    private final String text;

    private Operator(int precedence, String text)
    {
        this.precedence = precedence;
        this.text = text;
    }

    @Override
    public String toString() {
        return text;
    }

    public int getPrecedence() {
        return precedence;
    }
}

static class OperatorNode extends Node {

    private final Operator op;

    OperatorNode(Operator op)
    {
        this.op = op;
    }

    @Override
    void Print() {
        left.Print();
        System.out.print(op);
        right.Print();
    }

    @Override
    void PrintWithParens() {
        System.out.print("(");
        left.PrintWithParens();
        System.out.print(op);
        right.PrintWithParens();
        System.out.print(")");
    }

    @Override
    void PrintWithMinimalParens() {

        boolean needParens = 
                (left.precedence() != 0 && left.precedence() < this.op.precedence) 
                || 
                (right.precedence() != 0 && right.precedence() < this.op.precedence);

        if(needParens)
            System.out.print("(");

        left.PrintWithMinimalParens();
        System.out.print(op);
        right.PrintWithMinimalParens();

        if(needParens)
            System.out.print(")");

    }

    @Override
    int precedence() {
        return op.getPrecedence();
    }

}

static class TextNode extends Node {

    TextNode(String text)
    {
        this.text = text;
    }

    @Override
    void Print() {
        System.out.print(text);
    }

    @Override
    void PrintWithParens() {
        System.out.print(text);
    }

    @Override
    void PrintWithMinimalParens() {
        System.out.print(text);
    }

}

private static void printExpressions(Node rootNode) {

    System.out.print("Print() : ");
    rootNode.Print();
    System.out.println();
    System.out.print("PrintWithParens() : ");
    rootNode.PrintWithParens();
    System.out.println();
    System.out.print("PrintWithMinimalParens() : ");
    rootNode.PrintWithMinimalParens();
    System.out.println();
    System.out.println();

}

public static void main(String[] args) 
{
    System.out.println("Desired:  1+2+3+4");
    Node rootNode = new OperatorNode(Operator.PLUS);
    rootNode.left = new TextNode("1");
    rootNode.right = new OperatorNode(Operator.PLUS);
    rootNode.right.left = new TextNode("2");
    rootNode.right.right = new OperatorNode(Operator.PLUS);
    rootNode.right.right.left = new TextNode("3");
    rootNode.right.right.right = new TextNode("4");

    printExpressions(rootNode);

    System.out.println("Desired: 1+2*3+4");
    rootNode = new OperatorNode(Operator.PLUS);
    rootNode.left = new TextNode("1");
    rootNode.right = new OperatorNode(Operator.PLUS);
    rootNode.right.left = new OperatorNode(Operator.MULTIPLY);
    rootNode.right.left.left = new TextNode("2");
    rootNode.right.left.right = new TextNode("3");
    rootNode.right.right = new TextNode("4");

    printExpressions(rootNode);

    System.out.println("Desired: 1+2*(3+4)");
    rootNode = new OperatorNode(Operator.PLUS);
    rootNode.left = new TextNode("1");
    rootNode.right = new OperatorNode(Operator.MULTIPLY);
    rootNode.right.left = new TextNode("2");
    rootNode.right.right = new OperatorNode(Operator.PLUS);
    rootNode.right.right.left = new TextNode("3");
    rootNode.right.right.right = new TextNode("4");

    printExpressions(rootNode);

    System.out.println("Desired: 1+2^8*3+4");
    rootNode = new OperatorNode(Operator.PLUS);
    rootNode.left = new TextNode("1");
    rootNode.right = new OperatorNode(Operator.MULTIPLY);
    rootNode.right.left = new OperatorNode(Operator.POW);
    rootNode.right.left.left = new TextNode("2");
    rootNode.right.left.right = new TextNode("8");
    rootNode.right.right = new OperatorNode(Operator.PLUS);
    rootNode.right.right.left = new TextNode("3");
    rootNode.right.right.right = new TextNode("4");

    printExpressions(rootNode);
    }
}

Output:

Desired:  1+2+3+4
Print() : 1+2+3+4
PrintWithParens() : (1+(2+(3+4)))
PrintWithMinimalParens() : 1+2+3+4

Desired: 1+2*3+4
Print() : 1+2*3+4
PrintWithParens() : (1+((2*3)+4))
PrintWithMinimalParens() : 1+2*3+4

Desired: 1+2*(3+4)
Print() : 1+2*3+4
PrintWithParens() : (1+(2*(3+4)))
PrintWithMinimalParens() : 1+(2*3+4)

Desired: 1+2^8*3+4
Print() : 1+2^8*3+4
PrintWithParens() : (1+((2^8)*(3+4)))
PrintWithMinimalParens() : 1+(2^8*3+4)

Is is possible to implement the PrintWithMinimalParens() that I want? Does the fact that order is implicit in the tree make doing what I want impossible?

like image 255
Justin Avatar asked Jan 05 '13 18:01

Justin


People also ask

How do you create an Abstract Syntax Tree?

The Abstract Syntax Tree is generated using both the list of tokens (from the lexical analysis) and the source code. The AST is generated during the syntax analysis stage of the compilation. Any syntax error would be detected and a syntax error message would then be returned, stopping the compilation process.

How do you create an Abstract Syntax Tree in Python?

The abstract syntax itself might change with each Python release; this module helps to find out programmatically what the current grammar looks like. An abstract syntax tree can be generated by passing ast. PyCF_ONLY_AST as a flag to the compile() built-in function, or using the parse() helper provided in this module.

Are abstract syntax trees binary?

Binary trees have many uses. In fact, compilers often use them to build what is known as an abstract syntax tree (or AST) - an intermediate representation of the code that is not yet fully compiled.


1 Answers

In your code you are comparing each operator with its children to see if you need parentheses around it. But you should actually be comparing it with its parent. Here are some rules that can determine if parentheses can be omitted:

  1. You never need parentheses around the operator at the root of the AST.
  2. If operator A is the child of operator B, and A has a higher precedence than B, the parentheses around A can be omitted.
  3. If a left-associative operator A is the left child of a left-associative operator B with the same precedence, the parentheses around A can be omitted. A left-associative operator is one for which x A y A z is parsed as (x A y) A z.
  4. If a right-associative operator A is the right child of a right-associative operator B with the same precedence, the parentheses around A can be omitted. A right-associative operator is one for which x A y A z is parsed as x A (y A z).
  5. If you can assume that an operator A is associative, i.e. that (x A y) A z = x A (y A z) for all x,y,z, and A is the child of the same operator A, you can choose to omit parentheses around the child A. In this case, reparsing the expression will yield a different AST that gives the same result when evaluated.

Note that for your first example, the desired result is only correct if you can assume that + is associative (which is true when dealing with normal numbers) and implement rule #5. This is because your input tree is built in a right-associative fashion, while operator + is normally left-associative.

like image 88
interjay Avatar answered Nov 11 '22 17:11

interjay