Unindented code breaks my grammar

Question

I have a .g4 grammar for a VBA/VB6 lexer/parser, where the lexer skips line continuation tokens. Not skipping them breaks the parser, so that isn't an option. Here's the lexer rule in question:

LINE_CONTINUATION : ' ' '_' '\r'? '\n' -> skip;

The problem is that whenever a continued line starts at column 1, the parser blows up:

Sub Test()
Debug.Print "Some text " & _
vbNewLine & "Some more text"    
End Sub

I thought "Hey I know! I'll just pre-process the string I'm feeding ANTLR to insert an extra whitespace before the underscore, and change the grammar to accept it!"

So I changed the rule like this:

LINE_CONTINUATION : WS? WS '_' NEWLINE -> skip;
NEWLINE : WS? ('\r'? '\n') WS?;
WS : [ \t]+;

...and the test VBA code above gave me this parser error:

extraneous input 'vbNewLine' expecting WS

For now my only solution is to tell my users to properly indent their code. Is there any way I can fix that grammar rule?

(Full VBA.g4 grammar file on GitHub)

Ira Baxter · Accepted Answer

You basically want line continuation to be treated like whitespace.

OK, then add the lexical definition of line continuation to the WS token. WS will then pick up the line continuation, and you don't need the LINE_CONTINUATION rule anywhere.

//LINE_CONTINUATION : ' ' '_' '\r'? '\n' -> skip;
NEWLINE : WS? ('\r'? '\n') WS?;
WS : ([ \t]+)|(' ' '_' '\r'? '\n');
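To see why this helps, here is a rough Python sketch that mimics the two lexer behaviours with regular expressions. It is not ANTLR and not the real VBA grammar: the `tokenize` function, the `WORD`/`STRING`/`OP` token names, and the simplified `NEWLINE` pattern are assumptions made up for illustration. It only shows what token stream the parser would see for the sample code in each case.

```python
import re

# Simplified token patterns (illustrative assumptions, not the real VBA.g4 rules).
# The continuation pattern is listed before plain whitespace because Python's
# alternation takes the first alternative that matches, whereas ANTLR picks
# the longest match.
TOKEN_SPEC = [
    ("LINE_CONTINUATION", r" _\r?\n"),
    ("WS", r"[ \t]+"),
    ("NEWLINE", r"\r?\n"),
    ("STRING", r'"[^"\r\n]*"'),
    ("WORD", r"[A-Za-z_][A-Za-z0-9_.]*"),
    ("OP", r"[&()]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text, merge_continuation=True):
    """Return (kind, text) pairs.

    merge_continuation=True  -> a continuation is emitted as a WS token
                                (the approach suggested in the answer).
    merge_continuation=False -> a continuation is dropped, like '-> skip'.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind = match.lastgroup
        pos = match.end()
        if kind == "LINE_CONTINUATION":
            if not merge_continuation:
                continue  # skipped: the parser never sees it
            kind = "WS"   # merged: the parser sees ordinary whitespace
        tokens.append((kind, match.group()))
    return tokens


# The continued line starts at column 1, as in the question.
SAMPLE = 'Debug.Print "Some text " & _\nvbNewLine & "Some more text"'

if __name__ == "__main__":
    print([kind for kind, _ in tokenize(SAMPLE, merge_continuation=False)])
    print([kind for kind, _ in tokenize(SAMPLE, merge_continuation=True)])
```

In the skipped variant, the `&` operator is followed directly by `vbNewLine` with no whitespace token in between, which would be consistent with the parser complaining where it expects WS. In the merged variant, the continuation itself becomes the WS token the parser is looking for.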

Tags:

parsing

grammar

antlr4

Mathieu Guindon