I'm writing an Eclipse/Xtext plugin for CoffeeScript, and I realized I'll probably need to write a lexer for it by hand. The CoffeeScript parser also uses a hand-written lexer to handle indentation and other tricks in the grammar.
Xtext generates a class that extends org.eclipse.xtext.parser.antlr.Lexer, which in turn extends org.antlr.runtime.Lexer. So I suppose I'll have to extend it. I can see two ways to do that:
- Override mTokens(). This is done by the generated code, changing the internal state.
- Override nextToken(), which seems a natural approach, but then I'll have to keep track of the internal state.
I couldn't find any example of how to write even a simple lexer for ANTLR without a grammar file. So the easiest answer would be a pointer to one.
An answer to "Xtext: grammar for language with significant/semantic whitespace" refers to todotext, which handles the problem of indentation by changing the tokens in the underlying input stream. I don't want to go that way, because it would be difficult to handle the other tricks of the CoffeeScript grammar.
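To make the state-tracking concern with nextToken() concrete, here is a minimal sketch of what keeping that state by hand could look like. It is plain Java with no ANTLR or Xtext dependency: SketchToken and IndentTrackingLexer are hypothetical names made up for illustration, and a real lexer would return org.antlr.runtime.Token instances instead. It only counts leading spaces and does not report inconsistent dedents.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a token; a real Xtext lexer would return org.antlr.runtime.Token.
final class SketchToken {
    final String type; // "INDENT", "DEDENT", "LINE" or "EOF"
    final String text;

    SketchToken(String type, String text) {
        this.type = type;
        this.text = text;
    }
}

// Hypothetical hand-written token source: all state lives in fields between nextToken() calls.
final class IndentTrackingLexer {
    private final String[] lines;
    private int lineIndex = 0;
    private final Deque<Integer> indentStack = new ArrayDeque<>();   // open indentation widths
    private final Deque<SketchToken> pending = new ArrayDeque<>();   // tokens queued for later calls

    IndentTrackingLexer(String source) {
        this.lines = source.split("\n");
        indentStack.push(0);
    }

    SketchToken nextToken() {
        // One input line can yield several tokens, so drain the queue first.
        if (!pending.isEmpty()) {
            return pending.poll();
        }
        // Blank lines do not affect indentation.
        while (lineIndex < lines.length && lines[lineIndex].trim().isEmpty()) {
            lineIndex++;
        }
        if (lineIndex >= lines.length) {
            // Close every block that is still open, then signal end of input.
            while (indentStack.peek() > 0) {
                indentStack.pop();
                pending.add(new SketchToken("DEDENT", ""));
            }
            pending.add(new SketchToken("EOF", ""));
            return pending.poll();
        }
        String line = lines[lineIndex++];
        int width = 0;
        while (width < line.length() && line.charAt(width) == ' ') {
            width++;
        }
        if (width > indentStack.peek()) {
            indentStack.push(width);
            pending.add(new SketchToken("INDENT", ""));
        } else {
            while (width < indentStack.peek()) {
                indentStack.pop();
                pending.add(new SketchToken("DEDENT", ""));
            }
        }
        pending.add(new SketchToken("LINE", line.trim()));
        return pending.poll();
    }

    // Convenience helper: collects token types up to and including EOF.
    static List<String> tokenTypes(String source) {
        IndentTrackingLexer lexer = new IndentTrackingLexer(source);
        List<String> types = new ArrayList<>();
        SketchToken token;
        do {
            token = lexer.nextToken();
            types.add(token.type);
        } while (!token.type.equals("EOF"));
        return types;
    }
}
```

The pending queue is the important part: a single line of input can produce several tokens (for example two DEDENTs followed by the line itself), so nextToken() has to remember what it still owes the parser.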
UPDATE:
I realized in the meantime that my question was partly Xtext specific.
Here is what I did -- and it works.
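import org.antlr.runtime.CharStream;
import org.antlr.runtime.CommonToken;
import org.antlr.runtime.Token;

public class MyLexer extends myprj.parser.antlr.internal.InternalMylangLexer {
    private SomeExternalLexer externalLexer;

    // The constructor must carry the class name (MyLexer, not Lexer).
    public MyLexer(CharStream in) {
        super(in);
        externalLexer = new SomeExternalLexer(in);
    }

    // Delegate tokenization to the external lexer instead of the generated rules.
    @Override
    public Token nextToken() {
        Token token = null;
        ExternalToken extToken = null;
        try {
            extToken = externalLexer.nextToken();
            if (extToken == null) {
                token = CommonToken.INVALID_TOKEN;
            }
            else {
                token = mapExternalToken(extToken);
            }
        }
        catch (Exception e) {
            // Any failure in the external lexer is reported as an invalid token.
            token = CommonToken.INVALID_TOKEN;
        }
        return token;
    }

    // Convert an external token into an ANTLR token the Xtext parser understands.
    protected Token mapExternalToken(ExternalToken extToken) {
        // ...
    }
}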
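The body of mapExternalToken is elided above, so the following is only a guess at its general shape, not the actual implementation. It is a self-contained sketch using stand-in types: ExternalTokenSketch, MappedToken and TokenMapper are hypothetical names, and the token type numbers in the usage note are made up. In a real lexer the result would be an org.antlr.runtime.CommonToken, and the type numbers would come from the constants in the generated lexer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shape of a token produced by the external lexer.
final class ExternalTokenSketch {
    final String kind;  // e.g. "IDENTIFIER"
    final String text;
    final int start;    // offset of the first character in the input

    ExternalTokenSketch(String kind, String text, int start) {
        this.kind = kind;
        this.text = text;
        this.start = start;
    }
}

// Stand-in for org.antlr.runtime.CommonToken, holding the fields that get copied.
final class MappedToken {
    final int type;
    final String text;
    final int startIndex;
    final int stopIndex; // inclusive, following the ANTLR 3 convention

    MappedToken(int type, String text, int startIndex, int stopIndex) {
        this.type = type;
        this.text = text;
        this.startIndex = startIndex;
        this.stopIndex = stopIndex;
    }
}

// Hypothetical mapper: a lookup table from external token kinds to parser token types.
final class TokenMapper {
    static final int INVALID_TYPE = 0; // mirrors ANTLR 3's Token.INVALID_TOKEN_TYPE

    private final Map<String, Integer> typeByKind = new HashMap<>();

    TokenMapper register(String kind, int type) {
        typeByKind.put(kind, type);
        return this;
    }

    MappedToken map(ExternalTokenSketch ext) {
        // Unknown kinds fall back to the invalid type instead of throwing.
        int type = typeByKind.getOrDefault(ext.kind, INVALID_TYPE);
        int stop = ext.start + ext.text.length() - 1;
        return new MappedToken(type, ext.text, ext.start, stop);
    }
}
```

For example, new TokenMapper().register("IDENTIFIER", 4).map(new ExternalTokenSketch("IDENTIFIER", "foo", 10)) yields a token of type 4 covering offsets 10 to 12. Copying the offsets accurately probably matters, since the editor needs them to relate tokens back to positions in the text.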
Then I have a slightly customized parser containing:
public class BetterParser extends MylangParser {
    @Override
    protected TokenSource createLexer(CharStream stream) {
        MyLexer lexer = new MyLexer(stream);
        return lexer;
    }
}
I also had to change my MylangRuntimeModule.java to contain this method:
@Override
public Class<? extends org.eclipse.xtext.parser.IParser> bindIParser() {
    return myprj.parser.BetterParser.class;
}
And that's it.
Another way (without the need to create a custom parser) is to create a custom lexer by extending Xtext's lexer (org.eclipse.xtext.parser.antlr.Lexer) as follows:
public class CustomSTLexer extends Lexer {
    @Override
    public void mTokens() {
        // implement lexer here
    }
}
Then you bind it in your module:
@Override
public void configureRuntimeLexer(Binder binder) {
    binder.bind(Lexer.class)
        .annotatedWith(Names.named(LexerBindings.RUNTIME))
        .to(CustomSTLexer.class);
}
If you want to have a look at a complete example, I have implemented a custom lexer for an Xtext-based editor for StringTemplate called hastee.