This should be fairly simple. I'm working on a lexer grammar using ANTLR and want to limit the length of variable identifiers to 30 characters. I tried to do this with the following rule, which uses normal regex syntax apart from the quoted characters:
ID : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|'_'){0,29} {System.out.println("IDENTIFIER FOUND.");}
;
No errors in code generation, but compilation failed due to a line in the generated code that was simply:
0,29
Obviously ANTLR is taking the text between the braces and placing it in the accept-state area along with the print statement. I searched the ANTLR site and found no example of, or reference to, an equivalent expression. What should the syntax of this expression be?
ANTLR does not support the {m,n} quantifier syntax. ANTLR sees the {} of your quantifier and can't tell them apart from the {} that surround your actions.
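Workarounds:
1. Enforce the limit semantically: let the lexer accept an ID of any length, then complain about identifiers that are too long.
2. Build the quantification into the rule manually.
This is an example of a manual rule (#2) that limits IDs to 8 characters.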
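// fragment: SUBID is only a helper for ID and is never emitted as a token of its own
fragment SUBID : ('a'..'z'|'A'..'Z'|'0'..'9'|'_')
;
// one leading letter plus up to seven SUBID characters: 8 characters maximum
ID : ('a'..'z'|'A'..'Z')
(SUBID (SUBID (SUBID (SUBID (SUBID (SUBID SUBID?)?)?)?)?)?)?
;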
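Writing the nested form by hand for the 30-character limit in the question (one letter plus up to 29 further characters) is tedious and easy to get wrong. As a rough sketch, a small Java helper can generate the nested-optional text to paste into the grammar. The class name `NestedOptionalRule` and the method `nestedOptional` are made up for this illustration and are not part of ANTLR.

```java
// Hypothetical helper (not part of ANTLR): builds the nested-optional
// expression used in the manual rule, e.g. "(SUBID (SUBID SUBID?)?)?".
final class NestedOptionalRule {
    private NestedOptionalRule() {}

    // Returns an expression that matches ruleName between 0 and maxRepeats times.
    static String nestedOptional(String ruleName, int maxRepeats) {
        if (maxRepeats < 1) {
            throw new IllegalArgumentException("maxRepeats must be at least 1");
        }
        StringBuilder sb = new StringBuilder();
        // Every repeat except the innermost one opens a group.
        for (int i = 1; i < maxRepeats; i++) {
            sb.append('(').append(ruleName).append(' ');
        }
        // Innermost repeat: a single optional reference.
        sb.append(ruleName).append('?');
        // Close each group and make it optional.
        for (int i = 1; i < maxRepeats; i++) {
            sb.append(")?");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 29 optional SUBIDs after the leading letter gives a 30-character limit.
        System.out.println(nestedOptional("SUBID", 29));
    }
}
```

Calling `nestedOptional("SUBID", 7)` reproduces the second line of the ID rule above. The output for 29 would replace that line for a 30-character limit.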
Personally, I'd go with the semantic solution (#1). There is very little reason these days to limit the identifiers in a language, and even less reason to cause a syntax error (early abort of the compile) when such a rule is violated.
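For the semantic approach, the length check itself is only a few lines. The sketch below assumes a Java target. The class `IdentifierLengthCheck` and its `check` method are hypothetical names for this example, and how the message gets reported (warning, error list, exception) is left open. The ID rule would match identifiers of any length and call the helper from its action.

```java
// Hypothetical helper for the semantic approach: the lexer accepts an ID of
// any length and this check decides whether to complain about it.
final class IdentifierLengthCheck {
    // Limit taken from the question; adjust as needed.
    static final int MAX_LENGTH = 30;

    private IdentifierLengthCheck() {}

    // Returns a diagnostic message if the identifier is too long, otherwise null.
    static String check(String identifier, int maxLength) {
        if (identifier.length() <= maxLength) {
            return null;
        }
        return "identifier '" + identifier + "' is " + identifier.length()
                + " characters long; the limit is " + maxLength;
    }

    // Assumed use inside the ID rule's action (Java target, where the lexer's
    // getText() returns the text matched for the current token):
    //   { String msg = IdentifierLengthCheck.check(getText(), 30);
    //     if (msg != null) System.err.println(msg); }

    public static void main(String[] args) {
        System.out.println(check("counter", MAX_LENGTH));
        System.out.println(check("aVeryLongIdentifierNameThatKeepsGoing", MAX_LENGTH));
    }
}
```

Because the over-long identifier is still lexed as a single ID token, the parser can keep going and report other problems in the same run.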