Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to understand, when do semantic predicates work and when they don't?

Below are two grammars.

In this grammar, semantic predicates do "work". I.e. if they are false, rules don't match and if they are true, rules do match:

expr
    : term
    | expr asterisk expr (asterisk expr)*
    | expr plus expr (plus expr)*
    ;

plus: {_input.LT(1).getText().equals("+")}? SPECID;
asterisk: {_input.LT(1).getText().equals("*")}? SPECID;


term
    : ALNUMID
    | STRID
    | LPAREN expr RPAREN
    ;

and in this grammar, semantic predicates don't work. I.e. if they are false, rules still match bu error is printed.

expr
    : add
    | mult
    | term
    ;


mult
    : term (asterisk) term ((asterisk) term)*
    ;

add
    : (term|mult) plus (term|mult) (plus (term|mult))*
    ;


plus: {_input.LT(1).getText().equals("+")}? SPECID;
asterisk: {_input.LT(1).getText().equals("*")}? SPECID;



term
    : ALNUMID
    | STRID
    | LPAREN expr RPAREN
    ;

In both cases I am trying to run simple tests to parse 2 + 2 and 2 * 2.

Is it possible to understand beforehand, will definition work or not? What is the logic here?


I tried the suggestion of @MickeLischke, but it didn't work

Here is my grammar

expr
    : add
    | mult
    | term
    ;


mult
    : term asterisk term ((asterisk) term)*
    ;

add
    : (term|mult) plus (term|mult) (plus (term|mult))*
    ;


plus
    : {_input.LT(1).getText().equals("+")}? SPECID
    | // Empty by design
    ;

asterisk
    :
    {_input.LT(1).getText().equals("*")}? SPECID
    | // Empty by design
    ;



term
    : ALNUMID
    | STRID
    | LPAREN expr RPAREN
    ;

here are testing code

public class MyParserTest {

    private static MyParser parse(String input) {
        CharStream charStream = CharStreams.fromString(input);

        // Create lexer
        MyLexer lexer = new MyLexer(charStream);

        // Create token stream
        CommonTokenStream tokens = new CommonTokenStream(lexer);

        // Create parser
        MyParser parser = new MyParser(tokens);

        parser.addErrorListener(new BaseErrorListener() {
            @Override
            public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
                                    int line, int charPositionInLine, String msg,
                                    RecognitionException e) {
                System.err.println("Rule stack: " + ((Parser)recognizer).getRuleInvocationStack());
                System.err.println("line " + line + ":" + charPositionInLine + " " + msg);

                // Get and print all tokens
                CommonTokenStream tokens = (CommonTokenStream) ((Parser)recognizer).getTokenStream();
                tokens.fill();
                System.err.println("Tokens:");
                for (Token token : tokens.getTokens()) {
                    System.err.println(token);
                }
            }
        });

        // Parse starting at 'expr' rule
        return parser;
    }

    private static String parseToLisp(String input) {
        MyParser parser = parse(input);
        return parser.expr().toStringTree(parser);
    }

    @Test
    public void testSimpleMultiplication() throws Exception {
        String value = "1 * 2";
        String actual = parseToLisp(value);
        String expected = "(expr (expr (term 1)) (mul *) (expr (term 2)))";

        assertEquals(expected, actual);
    }

    @Test
    public void testSimpleAddition() throws Exception {
        String value = "1 + 2";
        String actual = parseToLisp(value);
        String expected = "(expr (add (term 1) (plus +) (term 2)))";

        assertEquals(expected, actual);
    }

the output for multiplication is

line 1:2 no viable alternative at input '*'
Rule stack: [plus, add, expr]
line 1:2 no viable alternative at input '*'
Tokens:
[@0,0:0='1',<11>,1:0]
[@1,2:2='*',<12>,1:2]
[@2,4:4='2',<11>,1:4]
[@3,5:4='<EOF>',<-1>,1:5]

Expected :(expr (expr (term 1)) (mul *) (expr (term 2)))
Actual   :(expr (add (term 1) (plus *) (term 2)))
like image 967
Dims Avatar asked Dec 09 '25 16:12

Dims


1 Answers

When you have a rule with only one alt and that alt is not viable because of the predicate, the entire rule doesn't match at all. If it is a mandatory part of your grammar, you will get an error.

However, it seems to you want to use the predicates to make some code optional. You have two options here to make that work:

  1. Add an empty alternative to the rule with only one alt and the predicate to allow an alternative path:
plus: 
    {_input.LT(1).getText().equals("+")}? SPECID;
    | // Empty by design.
  1. Ensure that the rule which has the single alt with the predicate is referenced in a path that is optional.

The second option is what makes your first grammar work. Both asterisk and plus are referenced in a block with zero or more occurences. Another way would be to make a rule reference optional (plus?).

Don't forget to consume the + and * input once you checked for their existence:

plus: 
    {_input.LT(1).getText().equals("+")}? '+' SPECID;
    | // Empty by design.
like image 72
Mike Lischke Avatar answered Dec 11 '25 04:12

Mike Lischke