I am a newbie to Antlr and I wanted the below implementation to be done using Antlr4. I am having the below-written functions.
1. FUNCTION.add(Integer a,Integer b)
2. FUNCTION.concat(String a,String b)
3. FUNCTION.mul(Integer a,Integer b)
And I am storing the functions metadata like this.
Map<String,String> map=new HashMap<>();
map.put("FUNCTION.add","Integer:Integer,Integer");
map.put("FUNCTION.concat","String:String,String");
map.put("FUNCTION.mul","Integer:Integer,Integer");
Where, Integer:Integer,Integer
represents Integer
is the return type and input params the function will accespts are Integer,Integer
.
if the input is something like this
FUNCTION.concat(Function.substring(String,Integer,Integer),String)
or
FUNCTION.concat(Function.substring("test",1,1),String)
Using the visitor implementation I wanted to check whether the input is validate or not against the functions metadata stored in map.
Below is the lexer and parser that I'm using:
Lexer MyFunctionsLexer.g4:
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
Parser MyFunctionsParser.g4:
parser grammar MyFunctionsParser;
options {
tokenVocab=MyFunctionsLexer;
}
function : FUNCTION '.' NAME '('(function | argument (',' argument)*)')';
argument: (NAME | function);
WS : [ \t\r\n]+ -> skip;
I am using Antlr4.
Below is the implementation I'm using as per the suggested answer.
Visitor Implementation: public class FunctionValidateVisitorImpl extends MyFunctionsParserBaseVisitor {
Map<String, String> map = new HashMap<String, String>();
public FunctionValidateVisitorImpl()
{
map.put("FUNCTION.add", "Integer:Integer,Integer");
map.put("FUNCTION.concat", "String:String,String");
map.put("FUNCTION.mul", "Integer:Integer,Integer");
map.put("FUNCTION.substring", "String:String,Integer,Integer");
}
@Override
public String visitFunctions(@NotNull MyFunctionsParser.FunctionsContext ctx) {
System.out.println("entered the visitFunctions::");
for (int i = 0; i < ctx.getChildCount(); ++i)
{
ParseTree c = ctx.getChild(i);
if (c.getText() == "<EOF>")
continue;
String top_level_result = visit(ctx.getChild(i));
System.out.println(top_level_result);
if (top_level_result == null)
{
System.out.println("Failed semantic analysis: "+ ctx.getChild(i).getText());
}
}
return null;
}
@Override
public String visitFunction( MyFunctionsParser.FunctionContext ctx) {
// Get function name and expected type information.
String name = ctx.getChild(2).getText();
String type=map.get("FUNCTION." + name);
if (type == null)
{
return null; // not declared in function table.
}
String result_type = type.split(":")[0];
String args_types = type.split(":")[1];
String[] expected_arg_type = args_types.split(",");
int j = 4;
ParseTree a = ctx.getChild(j);
if (a instanceof MyFunctionsParser.FunctionContext)
{
String v = visit(a);
if (v != result_type)
{
return null; // Handle type mismatch.
}
} else {
for (int i = j; i < ctx.getChildCount(); i += 2)
{
ParseTree parameter = ctx.getChild(i);
String v = visit(parameter);
if (v != expected_arg_type[(i - j)/2])
{
return null; // Handle type mismatch.
}
}
}
return result_type;
}
@Override
public String visitArgument(ArgumentContext ctx){
ParseTree c = ctx.getChild(0);
if (c instanceof TerminalNodeImpl)
{
// Unclear if what this is supposed to parse:
// Mutate "1" to "Integer"?
// Mutate "Integer" to "String"?
// Or what?
return c.getText();
}
else
return visit(c);
}
}
Testcalss:
public class FunctionValidate {
public static void main(String[] args) {
String input = "FUNCTION.concat(FUNCTION.substring(String,Integer,Integer),String)";
ANTLRInputStream str = new ANTLRInputStream(input);
MyFunctionsLexer lexer = new MyFunctionsLexer(str);
CommonTokenStream tokens = new CommonTokenStream(lexer);
MyFunctionsParser parser = new MyFunctionsParser(tokens);
parser.removeErrorListeners(); // remove ConsoleErrorListener
parser.addErrorListener(new VerboseListener()); // add ours
FunctionsContext tree = parser.functions();
FunctionValidateVisitorImpl visitor = new FunctionValidateVisitorImpl();
visitor.visit(tree);
}
}
Lexer:
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
WS : [ \t\r\n]+ -> skip;
Parser:
parser grammar MyFunctionsParser;
options { tokenVocab=MyFunctionsLexer; }
functions : function* EOF;
function : FUNCTION '.' NAME '(' (function | argument (',' argument)*) ')';
argument: (NAME | function);
Verbose Listener:
public class VerboseListener extends BaseErrorListener {
@Override
public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
List<String> stack = ((Parser)recognizer).getRuleInvocationStack();
Collections.reverse(stack);
throw new FunctionInvalidException("line "+line+":"+charPositionInLine+" at "+ offendingSymbol+": "+msg);
}
}
Output:
It is not entering visitor implementation as it is not printing System.out.println("entered the visitFunctions::");
statement.
The Visitor pattern represents an operation to be performed on the elements of an object structure without changing the classes on which it operates. This pattern can be observed in the operation of a taxi company. When a person calls a taxi company (accepting a visitor), the company dispatches a cab to the customer.
Visitor design pattern is one of the behavioral design patterns. It is used when we have to perform an operation on a group of similar kind of Objects. With the help of visitor pattern, we can move the operational logic from the objects to another class.
The visitor pattern is useful when you want to process a data structure containing different kinds of objects, and you want to perform a specific operation on each of them, depending on its type. Your example is not the best, since you pass a single homogeneous list as input, so there is really no need for the pattern.
Below is a solution in C#. This should give you an idea of how to proceed. You should be able to easily translate the code to Java.
For ease, I implemented the code using my extension AntlrVSIX for Visual Studio 2019 with NET Core C#. It makes life easier using a full IDE that supports the building of split lexer/parser grammars, debugging, and a plug-in that is suited for editing Antlr grammars.
There are several things to note with your grammar. First, your parser grammar isn't accepted by Antlr 4.7.2. Production "WS : [ \t\r\n]+ -> skip;" is a lexer rule, it can't go in a parser grammar. It has to go into the lexer grammar (or you define a combined grammar). Second, I personally wouldn't define lexer symbols like DOT, and then use in the parser the RHS of the lexer symbol directly in the parser grammar, e.g., '.'. It's confusing, and I'm pretty sure there isn't an IDE or editor would know how to go to the definition "DOT: '.';" in the lexer grammar if you positioned your cursor on the '.' in the parser grammar. I never understood why it's allowed in Antlr, but c'est la vie. I would instead use the lexer symbol you define. Third, I would consider augmenting the parser grammar in the usual way with EOF, e.g., "functions : function* EOF". But, this is entirely up to you.
Now, on the problem statement, your example input contains an inconsistency. In the first case, "substring(String,Integer,Integer)", the input is in a meta-like description of substring(). In the second case, "substring(\"test\",1,1)", you are parsing code. The first case parses with your grammar, the second does not--there's no string literal lexer rule defined in your lexer grammar. It's unclear what you really want to parse.
Overall, I defined the visitor code over strings, i.e., each method returns a string representing the output type of the function or argument, e.g., "Integer" or "String" or null if there was an error (or you could throw an exception for static semantic errors). Then, using Visit() on each child in the parse tree node, check the resulting string if it is expected, and handle matches as you like.
One other thing to note. You can solve this problem via a visitor or listener class. The visitor class is useful for purely synthesized attributes. In this example solution, I return a string that represents the type of the function or arg up the associated parse tree, checking the value for each important child. The listener class is useful for L-attributed grammars--i.e., where you are passing attributes in a DFS-oriented manner, left to right at each node in the tree. For this example, you could use the listener class and only override the Exit() functions, but you would then need a Map/Dictionary to map a "context" into an attribute (string).
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
WS : [ \t\r\n]+ -> skip;
parser grammar MyFunctionsParser;
options { tokenVocab=MyFunctionsLexer; }
functions : function* EOF;
function : FUNCTION '.' NAME '(' (function | argument (',' argument)*) ')';
argument: (NAME | function);
using Antlr4.Runtime;
namespace AntlrConsole2
{
public class Program
{
static void Main(string[] args)
{
var input = @"FUNCTION.concat(FUNCTION.substring(String,Integer,Integer),String)";
var str = new AntlrInputStream(input);
var lexer = new MyFunctionsLexer(str);
var tokens = new CommonTokenStream(lexer);
var parser = new MyFunctionsParser(tokens);
var listener = new ErrorListener<IToken>();
parser.AddErrorListener(listener);
var tree = parser.functions();
if (listener.had_error)
{
System.Console.WriteLine("error in parse.");
}
else
{
System.Console.WriteLine("parse completed.");
}
var visitor = new Validate();
visitor.Visit(tree);
}
}
}
namespace AntlrConsole2
{
using System;
using Antlr4.Runtime.Misc;
using System.Collections.Generic;
class Validate : MyFunctionsParserBaseVisitor<string>
{
Dictionary<String, String> map = new Dictionary<String, String>();
public Validate()
{
map.Add("FUNCTION.add", "Integer:Integer,Integer");
map.Add("FUNCTION.concat", "String:String,String");
map.Add("FUNCTION.mul", "Integer:Integer,Integer");
map.Add("FUNCTION.substring", "String:String,Integer,Integer");
}
public override string VisitFunctions([NotNull] MyFunctionsParser.FunctionsContext context)
{
for (int i = 0; i < context.ChildCount; ++i)
{
var c = context.GetChild(i);
if (c.GetText() == "<EOF>")
continue;
var top_level_result = Visit(context.GetChild(i));
if (top_level_result == null)
{
System.Console.WriteLine("Failed semantic analysis: "
+ context.GetChild(i).GetText());
}
}
return null;
}
public override string VisitFunction(MyFunctionsParser.FunctionContext context)
{
// Get function name and expected type information.
var name = context.GetChild(2).GetText();
map.TryGetValue("FUNCTION." + name, out string type);
if (type == null)
{
return null; // not declared in function table.
}
string result_type = type.Split(":")[0];
string args_types = type.Split(":")[1];
string[] expected_arg_type = args_types.Split(",");
const int j = 4;
var a = context.GetChild(j);
if (a is MyFunctionsParser.FunctionContext)
{
var v = Visit(a);
if (v != result_type)
{
return null; // Handle type mismatch.
}
} else {
for (int i = j; i < context.ChildCount; i += 2)
{
var parameter = context.GetChild(i);
var v = Visit(parameter);
if (v != expected_arg_type[(i - j)/2])
{
return null; // Handle type mismatch.
}
}
}
return result_type;
}
public override string VisitArgument([NotNull] MyFunctionsParser.ArgumentContext context)
{
var c = context.GetChild(0);
if (c is Antlr4.Runtime.Tree.TerminalNodeImpl)
{
// Unclear if what this is supposed to parse:
// Mutate "1" to "Integer"?
// Mutate "Integer" to "String"?
// Or what?
return c.GetText();
}
else
return Visit(c);
}
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With