
ANTLR4 grammar regex and tilde

Tags: grammar, antlr4

I want to have an ANTLR grammar for CSV input.

  1. What's the difference between (~["])+ and (~['"'])+?

  2. Why is ~ important?

Here is my grammar:

grammar Exercice4;

csv : ligne+
    ;


ligne : exp (',' exp)* ('\n' | EOF)
  ;

exp : ID
    | INT
    | STRING
    ;

INT : '0'..'9'+;

ID : ('0'..'9' | 'a'..'z' | 'A'..'Z')+;

STRING : '"' (~["])+ '"';

WS : [ \t\r]+ -> skip; // commas inside [...] are literal set members, not separators; ',' and '\n' are needed by ligne
Naz Dodin asked Oct 22 '25 03:10



1 Answer

In a lexer rule, the characters inside square brackets define a character set, so ["] is the set containing the single character ". Because it is a set, each character is either a member or not, so listing a character twice, as in [""], makes no difference: it is the same set as ["].
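The set semantics can be illustrated outside ANTLR. The following is a minimal Python sketch, not ANTLR code: it models the text between [ and ] as a Python set of characters, on the assumption that quote characters inside the brackets are ordinary members rather than delimiters. The helper name char_set is hypothetical, and escapes and ranges are deliberately ignored.

```python
def char_set(body: str) -> frozenset:
    """Model the text between [ and ] as a set of single characters.

    Simplification: escape sequences and ranges such as a-z are not handled.
    """
    return frozenset(body)


double_only = char_set('"')      # models ["]
doubled = char_set('""')         # models [""]
both_quotes = char_set("'\"'")   # models ['"']

# Listing a character twice does not change the set.
print(doubled == double_only)    # True

# ['"'] has two distinct members: the apostrophe and the double quote.
print(len(both_quotes))          # 2
```

Under this model, ~['"'] from the question would exclude both the apostrophe and the double quote, whereas ~["] excludes only the double quote.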

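~ negates the set, so ~["] means any character except ". That is why it matters in the STRING rule: '"' (~["])+ '"' matches an opening quote, then one or more characters that are not a quote, then the closing quote.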
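As a rough analogy, a negated character class in a regular expression behaves the same way. The sketch below uses Python's re module rather than ANTLR, so it only approximates how the STRING rule consumes input. The names STRING and STRING_NO_APOSTROPHE and the sample text are made up for illustration.

```python
import re

# Analogue of STRING : '"' (~["])+ '"';
STRING = re.compile('"[^"]+"')
# Analogue of the same rule written with ~['"']: apostrophes are excluded too.
STRING_NO_APOSTROPHE = re.compile('"[^\'"]+"')

text = '"it\'s here",42'

m = STRING.match(text)
print(m.group(0) if m else None)         # "it's here"

# The apostrophe is not allowed inside the string, so nothing matches.
print(STRING_NO_APOSTROPHE.match(text))  # None

# Because of the +, an empty quoted string is not matched either.
print(STRING.match('""'))                # None
```

The last line points at a side effect of (~["])+ in the grammar: an empty field written as "" would not lex as STRING. Using * instead of + would allow it, if that is wanted.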

Lucas Trzesniewski answered Oct 25 '25 17:10
