Semantic lexer predicate performance

Question

I have a lexer that creates MACRO tokens for a dynamic list of macro strings passed to the lexer. I used a semantic predicate in the topmost lexer rule to implement this feature:

MACRO: { macros != null && tryMacro() }? .;

Where tryMacro() just checks if any macro string matches the input sequence.

The performance of this approach was very bad, and after some research I tried changing the lexer rule to the following:

MACRO: . { macros != null && tryMacro() }?;

This dramatically improved the performance, but I don't really understand why. :) Since '.' matches any character, the semantic predicate should be invoked exactly as many times as before, shouldn't it? Can someone explain this behavior?

Lucas Trzesniewski · Accepted Answer

The reason is pretty simple: if you put the predicate at the start, the lexer will evaluate it to decide if the MACRO rule should apply. If you put it at the end, it will only perform the check when it has a potential match for the MACRO rule.

Since MACRO is very generic, I suppose you put it at the end of the rules, so by the priority rules it will be tried last. It can only match single-character tokens, so more specific rules that match longer input will take priority.

If the MACRO rule is superseded by a higher-priority rule, it won't be considered and your predicate won't be invoked.
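To make the reasoning above countable, here is a toy model in Python. It is not ANTLR's actual algorithm, only a sketch of the explanation given here: a leading predicate is charged once per token start, while a trailing predicate is charged only when MACRO's one-character match is not beaten by a longer match from another rule. The KEYWORD, NUMBER and WS rules and the function name are made up for illustration, and the model assumes MACRO wins ties.

```python
import re

# Hypothetical "more specific" lexer rules, for illustration only.
KEYWORD = re.compile(r"[a-z]+")
NUMBER = re.compile(r"[0-9]+")
WS = re.compile(r"\s+")
SPECIFIC_RULES = [KEYWORD, NUMBER, WS]


def count_predicate_calls(text, predicate_first):
    """Walk over text token by token and count MACRO predicate evaluations.

    Toy model: the predicate is assumed to always fail (no macros defined),
    so only the number of evaluations is tracked.
    """
    calls = 0
    pos = 0
    while pos < len(text):
        # Longest match among the specific rules at this position.
        best = 0
        for rule in SPECIFIC_RULES:
            m = rule.match(text, pos)
            if m:
                best = max(best, m.end() - pos)

        if predicate_first:
            # Leading predicate: evaluated before MACRO can be attempted,
            # so it is charged at every token start.
            calls += 1
        elif best <= 1:
            # Trailing predicate: evaluated only when MACRO's single
            # character is not beaten by a longer match.
            calls += 1

        # Consume the winning token (at least one character).
        pos += max(best, 1)
    return calls


leading = count_predicate_calls("foo 123 + bar", predicate_first=True)
trailing = count_predicate_calls("foo 123 + bar", predicate_first=False)
print(leading, trailing)  # prints "7 4"
```

In this model the gap grows with the share of input covered by multi-character tokens, and each saved call matters more the more expensive tryMacro() is.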

Tags:

antlr4

Moritz Becker
