Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Antlr4: Evaluate math functions f(x)

last days i was working on my grammar to be able to calculate normal expressions like: 2+2*5; 2^2 or set variables like y=5+5; etc.

Now i want to parse functions like f(a,b)=2*a*b; and then call them later like f(5,7); Im having some troubles with that.

So lets say i try to parse a function-declaration like this:

function:name=Var'(' varNames=(MultVar|Var) ')' '=' (Var|Number) (operator=('*'|'/'|'+'|'-') (Var|Number))* SEMI;

So this (kind of) works, but i think its kinda "dirty" or whatever. So im working with a listener and when im in "exitFunction" i dont really know how to handle the function best, so i can evaluate things like f(5,7); very easy.

Im having a Java-class called "Function.java" with the method "public double eval(double... args)"

So right now im having the attributes String arguments; String expression; String name; and then i need to spend a lot of time in the Listener and try to find the correct arguments etc in the String. So lot of substring(), and indexOf() etc just trying to find the name, the arguments and the expression. I save the function in a Hashmap then.

In my parser a functions call looks like:

functioncall: name=Vart '('para=(MultipleNumbers) ')' SEMI;

MultipleNumbers would be:

MultipleNumber: Number(',' Number)+;

in the Lexer. So then i try to get the arguments out of the String, and replace them in the function. Then i have a normal expression which my programm can solve again.

Since this seems to be too hard for me and its almost impossible to get all the correct "substrings" etc i think there must be a better way to implement things like this. Especially when i wanna do things like:

 f(a,b)=2*a+b; a=5; f(a,5)

its getting even harder, because i cant rly mix numbers and variables. So is there a good way to implement a "function evaluator". Maybe i can save a whole tree in a Hashmap and then just change the variables inside the tree and parse it or? Think my grammar is pretty much horrible aswell for the task.

Since i didnt really work with antlr in the past, i appreciate every help. Hope someone can help me. And sorry for my bad english, i hope you're able to understand me.

Kind regards

FelRPI

like image 888
FelRPI Avatar asked Oct 19 '22 23:10

FelRPI


1 Answers

One way to do so is by parsing your Concrete Syntax Tree into a Abstract Syntax Tree. Then your function object stores the AST and parse it when called, which is usually a lot simpler. Considering your example, f(a, b) = 2 * a * b, this would be parsed into a similar concrete syntax tree:

Concrete Syntax Tree

Of course you can understand it well, but it isn't easy to parse! The expression 2 * a * b is written pretty much like a string, you don't quite know the operator precedence by looking at the tree (does 2 + a * b means 2 + (a * b) or (2 + a) * b?) and it would take some time to traverse it in the correct order.

However, we can parse it only once to build something more reliable, easier for a machine to understand:

Abstract Syntax Tree

Oh yes, now it is really easy to parse: its called with params.length parameters, you set them in a HashMap or whatever representing your scope, then you call the "function" * with parameters 2 and the expression *(a,b), which is another function, so we call it by passing a and b to the "function" *. Of course this is easily extensible to user-defined functions.

Considering the function abs (value) = max(value, -value), we would build a AST similar to:

Abs function AST

Parsing the AST is easy, ok. But what about building it? Not too hard either, if we consider all the operators as functions (most of type (num, num) -> num, some (unary) of type (num) -> num). We have a very stable definition for a node in this tree:

interface AstNode {
   double eval(Scope scope); // you can look at scope as a HashMap<String, double>
}

class TerminalNode : AstNode {
   String varName;
   double constantValue;

   public TerminalNode(String varName) {
      this.varName = varName;
   }
   public TerminalNode(double constantValue) {
      this.constantValue = constantValue;
      this.varName = null;
   }

   public double eval(Scope scope) {
      if (varName == null) return constantValue;
      return scope.get(varName);
   }
}

class CallNode : AstNode {
   Function target;
   String[] parameters;
   AstNode[] children;

   public CallNode(Function target, String[] parameters, AstNode[] children) {
      this.target = target;
      this.parameters = parameters;
      this.children = children;
   }

   public double eval(Scope scope) {
      double[] subExpressions = new double[children.length];
      Scope innerScope = new Scope(scope); // creates a copy
      for (int i = 0; i < children.length; i++) {
         // I'm using the copy here because of the edge-case: f(x) = g(x) + x; g(x) = x * 2;
         // In this case, the x variable is overriden in the innerScope when we resolve g(x)
         // but we "stored" the previous x value in the "outerScope" so we can add it later
         subExpressions[i] = children[i].eval(scope);
         innerScope.set(target.getParamName(i), subExpressions[i]);
      }
      // usually, you could do target.getNode().eval(innerScope)
      // however, this would limit the target function to only be user-defined functions
      // we leave this way so you can override the Function's eval method to a built-in
      return target.eval(innerScope);
   }
}

This may be an over-simplification, but is a good starting point. As you notice, a CallNode have other AstNode children, so it is a bit of an infinite recursion, broken when every children is a TerminalNode (variable or constant). As commented in the code, you could build your operators as members of the Function class, either by instantiating or extension of the class. Of course, this depends on your Function implementation. Another way is to build another AstNode class, the BuiltinNode (or even all the nodes PlusNode, DivideNode, etc) to solve the call using primitives.


Adding this to answer the comment, "how to make use of a built AST". Suppose you have an Function object called g, which stores the AST for the expressions f(x, y) = 2 * a * b. How to achieve the value of f(4, 2)?

To do this, we use the Scope object (or a HashMap<String, double> for what it matters). We create a scope for the function where its parameters have been determined, then we call it using the AST, which will use this scope for its inner levels.

The code can look something like:

Scope scope = new Scope();
scope.set("x", 4);
scope.set("y", 2);
// remember I stored the function f on the object g, just to make a distinction
// also, note this is equivalent to g.getNode().eval(scope), because the function stored in g is not a built-in!
System.out.println(g.eval(scope));

To solve this eval query, the AST will have the scope {x: 4, y: 2} beforehand (we created it), and will call the function *(a, b), with a=2 and b=x*y. To solve the first * function call we need to solve both its arguments (a and b). To solve a is easy, because it is a terminal node (the eval will return the terminal immediatly). To solve b, we need to call the eval of the inner function, generating a new scope {x: 4, y: 2, a: x, b: y}, where a and b are the arguments of the second * function. Both a and b will be solved as terminal nodes, then the second call to * can calculate its value (by the built-in function, which calculates 4*2=8) and return it to the caller, which was the b for the first * function.

Now having a scope like {x: 4, y: 2, a: 2, b: 8}, where x and y are the parameters of f and a and b are the arguments to the * function. With all arguments set, we can call the * built-in function, yielding 16, which is the result of the function!

Images generated by http://ironcreek.net/phpsyntaxtree

like image 75
Mephy Avatar answered Oct 28 '22 22:10

Mephy