I have a very simple grammar that looks like this:
grammar Testing;
a : d | b;
b : {_input.LT(1).equals("b")}? C;
d : {!_input.LT(1).equals("b")}? C;
C : .;
It parses one character from the input and checks whether it is equal to the character b. If so, rule b should be used; if not, rule d should be used.
However, the resulting parse does not match that expectation: every input is parsed using the first alternative (rule d).
$ antlr Testing.g4
$ javac *.java
$ grun Testing a -trace
c
enter a, LT(1)=c
enter d, LT(1)=c
consume [@0,0:0='c',<1>,1:0] rule d
exit d, LT(1)=
exit a, LT(1)=
$ grun Testing a -trace
b
enter a, LT(1)=b
enter d, LT(1)=b
consume [@0,0:0='b',<1>,1:0] rule d
exit d, LT(1)=
exit a, LT(1)=
In both cases, rule d is used. However, since there is a guard on rule d, I expect rule d to fail when the first character is exactly 'b'.
Am I doing something wrong when using the semantic predicates?
(I need to use semantic predicates because I need to parse a language where keywords could be used as identifiers).
Reference: https://github.com/antlr/antlr4/blob/master/doc/predicates.md
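_input.LT(int) returns a Token, not a String, so _input.LT(1).equals("b") will always return false. That makes the predicate on rule b always fail and the negated predicate on rule d always pass, which is why rule d is chosen for every input. What you want to do is call getText() on the Token and compare that: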
b : {_input.LT(1).getText().equals("b")}? C;
d : {!_input.LT(1).getText().equals("b")}? C;
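To see why the original predicates misbehave, here is a minimal plain-Java sketch of the same comparison. FakeToken is a hypothetical stand-in, not the ANTLR Token class; it only assumes, as stated above, that the token type does not define equals() against strings. The method names are made up for illustration.

```java
public class TokenEqualsDemo {
    // Hypothetical stand-in for a token: it carries text but does not
    // override equals(), so equals() falls back to Object identity.
    static final class FakeToken {
        private final String text;

        FakeToken(String text) {
            this.text = text;
        }

        String getText() {
            return text;
        }
    }

    // Mirrors the shape of {_input.LT(1).equals("b")}?
    // Compares a token object with a String, which can never be equal.
    static boolean brokenPredicate(FakeToken lt1) {
        return lt1.equals("b");
    }

    // Mirrors the shape of {_input.LT(1).getText().equals("b")}?
    // Compares the token's text with a String.
    static boolean fixedPredicate(FakeToken lt1) {
        return lt1.getText().equals("b");
    }

    public static void main(String[] args) {
        FakeToken b = new FakeToken("b");
        System.out.println(brokenPredicate(b)); // false
        System.out.println(fixedPredicate(b));  // true
    }
}
```

With the broken form, the guard on b is false even for the input b, and the negated guard on d is true for every input, which matches the trace in the question.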
However, it is often easier to handle keywords-as-identifiers without predicates, by listing the keyword tokens as alternatives of an identifier parser rule:
rule
: KEYWORD_1 identifier
;
identifier
: IDENTIFIER
| KEYWORD_1
| KEYWORD_2
| KEYWORD_3
;
KEYWORD_1 : 'k1';
KEYWORD_2 : 'k2';
KEYWORD_3 : 'k3';
IDENTIFIER : [a-zA-Z_] [a-zA-Z_0-9]*;