Here is a simple rule:
NAME : 'name1' | 'name2' | 'name3';
Is it possible to provide the alternatives for such a rule dynamically, using an array that contains strings?
Yes, the dynamic tokens also match the IDENTIFIER rule.
In that case, simply add a check that runs once the Id rule has matched completely,
to see whether the matched text is in a predefined collection.
If it is in the collection (a Set in my example), change the type of the token.
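Stripped of the ANTLR machinery, the check is just a set lookup on the matched text. Here is a minimal plain-Java sketch of that idea. The class `TokenReclassifier`, the enum `Kind`, and the method `classify` are hypothetical names used only for illustration; they are not part of ANTLR.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the lexer action; no ANTLR involved.
final class TokenReclassifier {
    enum Kind { Id, Special }

    // Mirrors the idea of the action on the Id rule:
    // look the matched text up in the set and pick the token type.
    static Kind classify(String matchedText, Set<String> special) {
        return special.contains(matchedText) ? Kind.Special : Kind.Id;
    }

    public static void main(String[] args) {
        Set<String> special = new HashSet<>(Arrays.asList("Mu", "bar"));
        // Splitting on spaces stands in for the lexer producing identifier tokens.
        for (String word : "foo bar baz Mu".split(" ")) {
            System.out.printf("%-10s'%s'%n", classify(word, special), word);
        }
    }
}
```

In the grammar, the same lookup sits in an action at the end of the Id rule, so it only runs after the whole identifier has been consumed.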
A small demo for ANTLR 3:
grammar T;

@lexer::members {
  private java.util.Set<String> special;

  public TLexer(ANTLRStringStream input, java.util.Set<String> special) {
    super(input);
    this.special = special;
  }
}

parse
 : (t=. {System.out.printf("\%-10s'\%s'\n", tokenNames[$t.type], $t.text);})* EOF
 ;

Id
 : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*
   {if(special.contains($text)) $type=Special;}
 ;

Int
 : '0'..'9'+
 ;

Space
 : (' ' | '\t' | '\r' | '\n') {skip();}
 ;

fragment Special : ;
And if you now run the following demo:
import org.antlr.runtime.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String source = "foo bar baz Mu";
    java.util.Set<String> set = new java.util.HashSet<String>();
    set.add("Mu");
    set.add("bar");
    TLexer lexer = new TLexer(new ANTLRStringStream(source), set);
    TParser parser = new TParser(new CommonTokenStream(lexer));
    parser.parse();
  }
}
You will see the following being printed:
Id        'foo'
Special   'bar'
Id        'baz'
Special   'Mu'
For ANTLR4, you can do something like this:
grammar T;

@lexer::members {
  private java.util.Set<String> special = new java.util.HashSet<>();

  public TLexer(CharStream input, java.util.Set<String> special) {
    this(input);
    this.special = special;
  }
}

tokens {
  Special
}

parse
 : .*? EOF
 ;

Id
 : [a-zA-Z_] [a-zA-Z_0-9]* {if(special.contains(getText())) setType(TParser.Special);}
 ;

Int
 : [0-9]+
 ;

Space
 : [ \t\r\n] -> skip
 ;
Test it with the following class:
import org.antlr.v4.runtime.*;
import java.util.HashSet;
import java.util.Set;

public class Main {
  public static void main(String[] args) {
    String source = "foo bar baz Mu";
    Set<String> set = new HashSet<String>(){{
      add("Mu");
      add("bar");
    }};
    TLexer lexer = new TLexer(CharStreams.fromString(source), set);
    CommonTokenStream tokenStream = new CommonTokenStream(lexer);
    tokenStream.fill();
    for (Token t : tokenStream.getTokens()) {
      System.out.printf("%-10s '%s'\n", TParser.VOCABULARY.getSymbolicName(t.getType()), t.getText());
    }
  }
}
which will print:
Id         'foo'
Special    'bar'
Id         'baz'
Special    'Mu'
EOF        '<EOF>'
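The question asks about an array of strings, while both lexer constructors take a Set. One possible way to bridge the two is sketched below. `SpecialNames` and `fromArray` are hypothetical names made up for this example, not part of ANTLR or of the grammars above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: turns the array from the question into the Set the lexer expects.
final class SpecialNames {
    private SpecialNames() {}

    // Copies the array into a read-only set; duplicates collapse,
    // and the order of first appearance is kept.
    static Set<String> fromArray(String... names) {
        return Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList(names)));
    }

    public static void main(String[] args) {
        String[] names = {"Mu", "bar", "Mu"};
        Set<String> special = fromArray(names);
        System.out.println(special); // prints [Mu, bar]
        // The resulting set would then be handed to the lexer constructor, e.g.
        // new TLexer(CharStreams.fromString(source), special) for the ANTLR 4 grammar.
    }
}
```

Because the lexer only ever calls `contains`, a read-only set is enough, and the set lookup stays cheap however many names the array holds.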