I'm trying to parse VBA code. Section 5.4.2.10 of the spec defines the Select Case
statement, which we've implemented as follows:
// 5.4.2.10 Select Case Statement
selectCaseStmt :
    SELECT whiteSpace? CASE whiteSpace? selectExpression endOfStatement
    caseClause*
    caseElseClause?
    END_SELECT
;
selectExpression : expression;
caseClause :
    CASE whiteSpace rangeClause (whiteSpace? COMMA whiteSpace? rangeClause)* endOfStatement block
;
caseElseClause : CASE whiteSpace? ELSE endOfStatement block;
rangeClause :
    expression
    | selectStartValue whiteSpace TO whiteSpace selectEndValue
    | (IS whiteSpace?)? comparisonOperator whiteSpace? expression
;
selectStartValue : expression;
selectEndValue : expression;
The problem is that the expression alternative in rangeClause is taking precedence, and makes this:
Select Case foo
    Case Is = 42
        Exit Sub
End Select
...ultimately get picked up and treated as {undeclared-variable} {EQ} {literal}, which is a problem, because Is ought to be a lexer token, not the LHS of a comparison expression:
expression whiteSpace? (EQ | NEQ | LT | GT | LEQ | GEQ | LIKE | IS) whiteSpace? expression # relationalOp
I tried reordering the alternatives so that the expression branch has lower precedence, like this:
rangeClause :
    selectStartValue whiteSpace TO whiteSpace selectEndValue
    | (IS whiteSpace?)? comparisonOperator whiteSpace? expression
    | expression
;
But that broke the entire grammar in all kinds of ways (about 1000 failing tests in my project), so instead I tried changing rangeClause to this (I removed the optional tokens, because Is without = is actually illegal VBA code):
rangeClause :
    expression (whiteSpace TO whiteSpace expression)? #caseFromTo
    | (IS whiteSpace comparisonOperator whiteSpace)? expression #caseIs
;
I then worked with the CaseFromToContext and CaseIsContext classes in the code (I had to, to keep it compiling), but again it broke ~1000 tests in my project.
Then I figured, "hey that's potentially ambiguous!" and turned it into this:
rangeClause :
    expression whiteSpace TO whiteSpace expression #caseFromTo
    | IS whiteSpace comparisonOperator whiteSpace expression #caseIs
    | expression #caseExpr
;
...but no luck, same identical outcome.
How can I make rangeClause understand this annoying Case Is = foobar syntax? I'm using ANTLR 4.3, but we're planning to upgrade to ANTLR 4.6 soon-ish.
If additional context is needed, the complete VBAParser.g4 grammar is on GitHub.
Turns out that re-ordering actually does work, but in order to keep the ambiguity out of the parse, the IS whiteSpace comparisonOperator alternative has to come first:
rangeClause :
    (IS whiteSpace?)? comparisonOperator whiteSpace? expression
    | selectStartValue whiteSpace TO whiteSpace selectEndValue
    | expression
;
The problem is with expression (and by extension selectStartValue and selectEndValue), which will recursively match Is = because comparisonOperator comparisonOperator is an expression match. There's probably some work that can be done to prevent comparisonOperator comparisonOperator from matching expression (it's never valid in VBA AFAIK), but the above works as a quick and dirty fix.
Basically all the above grammar does is ensure that the "invalid" comparisonOperator comparisonOperator matches as a rangeClause before it can be matched as an expression.
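To illustrate why the order of the alternatives matters, here is a minimal Python sketch. It is only a toy model and not ANTLR's actual prediction algorithm: the regexes and the names ALTERNATIVES and classify are hypothetical stand-ins for the grammar's alternatives, and the toy "expression" pattern is deliberately permissive enough to accept Is as an operand, mimicking the behaviour described above. It tries the alternatives in the order given and returns the first one that matches the whole clause.

```python
import re

# Hypothetical, heavily simplified stand-ins for the grammar rules.
OPERATORS = r"(?:<=|>=|<>|=|<|>)"
OPERAND = r"[A-Za-z_]\w*|\d+"  # an identifier or an integer literal

ALTERNATIVES = {
    # (IS whiteSpace?)? comparisonOperator whiteSpace? expression
    "caseIs": re.compile(rf"(?:Is\s*)?{OPERATORS}\s*(?:{OPERAND})", re.IGNORECASE),
    # selectStartValue whiteSpace TO whiteSpace selectEndValue
    "caseFromTo": re.compile(rf"(?:{OPERAND})\s+To\s+(?:{OPERAND})", re.IGNORECASE),
    # expression: permissive on purpose, so "Is" passes as an operand
    "caseExpr": re.compile(
        rf"(?:{OPERAND})(?:\s*{OPERATORS}\s*(?:{OPERAND}))*", re.IGNORECASE
    ),
}


def classify(clause, order):
    """Return the name of the first alternative in `order` that matches the whole clause."""
    text = clause.strip()
    for name in order:
        if ALTERNATIVES[name].fullmatch(text):
            return name
    return None


EXPRESSION_FIRST = ["caseExpr", "caseFromTo", "caseIs"]  # like the original rule
IS_FIRST = ["caseIs", "caseFromTo", "caseExpr"]          # like the reordered rule

print(classify("Is = 42", EXPRESSION_FIRST))  # caseExpr
print(classify("Is = 42", IS_FIRST))          # caseIs
print(classify("1 To 10", IS_FIRST))          # caseFromTo
print(classify("foo", IS_FIRST))              # caseExpr
```

With the expression alternative first, Is = 42 is swallowed as an ordinary comparison; with the Is alternative first, it is recognised as the Case Is form, while the To range and plain expression clauses still land in their own alternatives.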