
How to handle escape sequences in string literals in ANTLR 3?

I've been looking through the ANTLR v3 documentation (and my trusty copy of "The Definitive ANTLR Reference"), and I can't seem to find a clean way to implement escape sequences in string literals (I'm currently using the Java target). I had hoped to be able to do something like:

    fragment
    ESCAPE_SEQUENCE
        : '\\' '\'' { setText("'"); }
        ;

    STRING
        : '\'' (ESCAPE_SEQUENCE | ~('\'' | '\\'))* '\''
          {
              // strip the quotes from the resulting token
              setText(getText().substring(1, getText().length() - 1));
          }
        ;

For example, I would want the input token "'Foo\'s House'" to become the String "Foo's House".

Unfortunately, the setText(...) call in the ESCAPE_SEQUENCE fragment sets the text for the entire STRING token, which is obviously not what I want.

Is there a way to implement this grammar without adding a method to go back through the resulting string and manually replace escape sequences (e.g., with something like setText(escapeString(getText())) in the STRING rule)?
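
For reference, the post-processing fallback described above could be sketched roughly as follows. This is only an illustration: the class name `StringEscapes` and the method name `unescape` are hypothetical (the method plays the role of the `escapeString(...)` call mentioned in the question), and it assumes single-quoted tokens in which a backslash makes the next character literal.

```java
// Hypothetical helper, not part of ANTLR or of the original grammar.
final class StringEscapes {
    private StringEscapes() {}

    /**
     * Strips the surrounding single quotes from a STRING token's text and
     * resolves backslash escapes by keeping the escaped character literally
     * (so \' becomes ' and \\ becomes \).
     */
    static String unescape(String tokenText) {
        StringBuilder out = new StringBuilder();
        int end = tokenText.length() - 1; // index of the closing quote
        // Start at 1 to skip the opening quote; stop before the closing one.
        for (int i = 1; i < end; i++) {
            char c = tokenText.charAt(i);
            if (c == '\\' && i + 1 < end) {
                // Drop the backslash and keep the character that follows it.
                out.append(tokenText.charAt(++i));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

In the STRING rule's action this would be called as `setText(StringEscapes.unescape(getText()));`, replacing the `substring` call.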

Sam Martin asked Feb 02 '09 18:02



1 Answer

Here is how I accomplished this in the JSON parser I wrote.

    STRING
        @init{StringBuilder lBuf = new StringBuilder();}
        :
            '"'
            ( escaped=ESC {lBuf.append(getText());}
            | normal=~('"'|'\\'|'\n'|'\r') {lBuf.appendCodePoint(normal);}
            )*
            '"'
            {setText(lBuf.toString());}
        ;

    fragment
    ESC
        :   '\\'
            (   'n'    {setText("\n");}
            |   'r'    {setText("\r");}
            |   't'    {setText("\t");}
            |   'b'    {setText("\b");}
            |   'f'    {setText("\f");}
            |   '"'    {setText("\"");}
            |   '\''   {setText("\'");}
            |   '/'    {setText("/");}
            |   '\\'   {setText("\\");}
            |   ('u')+ i=HEX_DIGIT j=HEX_DIGIT k=HEX_DIGIT l=HEX_DIGIT
                       {setText(ParserUtil.hexToChar(i.getText(),j.getText(),
                                                     k.getText(),l.getText()));}
            )
        ;
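
The `ParserUtil.hexToChar` helper called in the `ESC` rule is not shown above. A minimal sketch of what such a helper might look like follows. Its signature and return type are assumptions, inferred only from the call site: four hex-digit strings in, and a `String` out so the result can be passed to `setText(...)`.

```java
// Hypothetical stand-in for the ParserUtil class referenced by the grammar;
// the original helper is not shown in the answer.
final class ParserUtil {
    private ParserUtil() {}

    /**
     * Combines four hexadecimal digit strings (as matched by HEX_DIGIT)
     * into the single UTF-16 code unit they encode, returned as a
     * one-character String.
     */
    static String hexToChar(String a, String b, String c, String d) {
        // Integer.parseInt throws NumberFormatException on non-hex input.
        int codeUnit = Integer.parseInt(a + b + c + d, 16);
        return String.valueOf((char) codeUnit);
    }
}
```

With this sketch, `ParserUtil.hexToChar("0", "0", "4", "1")` yields the string `A`. Characters outside the Basic Multilingual Plane would arrive as two consecutive `\uXXXX` escapes, each converted to one surrogate code unit.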

Bruno Ranschaert answered Sep 18 '22 17:09