
Semantic lexer predicate performance

Tags:

antlr4

I have a lexer that creates MACRO tokens for a dynamic list of macro strings passed to the lexer. I used a semantic predicate in the very top lexer rule to implement this feature:

MACRO: { macros != null && tryMacro() }? .;

Here, tryMacro() just checks whether any macro string matches the input sequence.

The performance of this approach was very bad, and after some research I tried changing the lexer rule to the following:

MACRO: . { macros != null && tryMacro() }?;

This dramatically improved the performance, but I don't really understand why. :) Since the '.' matches any character, the semantic predicate should be invoked exactly as many times as before, shouldn't it? Can someone explain this behavior?

Moritz Becker asked Feb 02 '26 18:02



1 Answer

The reason is pretty simple: if you put the predicate at the start, the lexer has to evaluate it up front to decide whether the MACRO rule should apply at all. If you put it at the end, the lexer only performs the check once it has a potential match for the MACRO rule.

Since MACRO is very generic, I suppose you put it at the end of the rules, so under the lexer's priority rules it gets tried last. It can only match single-character tokens, so more specific rules take priority over it.

If the MACRO rule is superseded by a higher-priority rule, it won't be considered and your predicate won't be invoked.
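To make the two placements easier to compare, here is a minimal grammar sketch. It assumes MACRO sits after more specific rules, as supposed above. The grammar name, the ID and WS rules, and the stubbed members are hypothetical and only there for illustration; the question does not show how macros or tryMacro() are actually defined.

```antlr
lexer grammar MacroSketch;

@members {
    // Hypothetical stand-ins for the members mentioned in the question.
    java.util.List<String> macros;
    boolean tryMacro() { /* real check against the macro strings goes here */ return false; }
}

// More specific rules, listed before the catch-all (an assumption).
ID : [a-zA-Z_]+ ;
WS : [ \t\r\n]+ -> skip ;

// Slow variant from the question: predicate before the wildcard.
// MACRO : { macros != null && tryMacro() }? . ;

// Fast variant from the question: predicate after the wildcard.
MACRO : . { macros != null && tryMacro() }? ;
```

One thing to watch with the second form: by the time the predicate runs, the wildcard has already matched a character, so a check that inspects the input stream may need to account for that.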

Lucas Trzesniewski answered Feb 04 '26 15:02



