How to parse comments with EBNF grammars

Question

When defining the grammar for a language parser, how do you deal with things like comments (e.g. /* ... */) that can occur at any point in the text?

Building up your grammar from tags within tags seems to work great when things are structured, but comments seem to throw everything off.

Do you just have to parse your text in two steps? First to remove these items, then to pick apart the actual structure of the code?

Thanks

Jonathan Leffler · Accepted Answer

Normally, comments are handled by the lexical analyzer, outside the scope of the main grammar. In effect, they are (usually) treated as if they were blanks.
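To make that concrete, here is a minimal sketch of a lexer that drops comments the same way it drops blanks, so the grammar never sees them. The token set (`NUMBER`, `IDENT`, `OP`) and the `tokenize` function are assumptions made up for illustration, not part of any particular tool.

```python
import re

# Hypothetical token set for a tiny expression language (illustration only).
# COMMENT is listed before OP so that "/*" starts a comment, not a "/" operator.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),      # block comment, non-greedy, may span lines
    ("WS",      r"\s+"),            # blanks, tabs, newlines
    ("NUMBER",  r"\d+"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"[-+*/=();]"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,  # let "." in COMMENT match newlines
)
SKIPPED = {"COMMENT", "WS"}  # comments are discarded exactly like blanks


def tokenize(text):
    """Return a list of (kind, lexeme) pairs with blanks and comments removed."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup not in SKIPPED:
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("x = 1 /* one */ + /* multi\nline */ 2;"))
```

The parser then works on the token list only, so the EBNF rules do not need to mention comments at all. This is still a single pass over the text, not a separate "strip comments first" step. Note that this sketch does not report an unterminated comment; a real lexer would want an explicit error for that.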

SK-logic · Answer

One approach is to use a separate lexer. Another, much more flexible way is to give every token-like entry (keywords, lexical elements, etc.) an implicit whitespace prefix (with comments counted as whitespace) that is valid for the current context. This is how most modern Packrat parsers deal with whitespace.
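A rough sketch of the implicit-prefix idea, written as a hand-rolled scannerless parser instead of a real Packrat generator (there is no memoization here). The `Parser` class, its `token` helper, and the toy rule `sum <- number ("+" number)*` are all assumptions for illustration.

```python
import re

# What counts as "whitespace" in this context: blanks and /* ... */ comments,
# in any mix. A different context could use a different skip pattern.
SKIP = re.compile(r"(?:\s+|/\*.*?\*/)*", re.DOTALL)


class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def token(self, pattern):
        """Skip whitespace/comments, then try to match `pattern`.

        Returns the matched text, or None with self.pos left unchanged
        so the caller can try another alternative.
        """
        start = SKIP.match(self.text, self.pos).end()
        match = re.compile(pattern).match(self.text, start)
        if match is None:
            return None
        self.pos = match.end()
        return match.group()

    def parse_sum(self):
        """sum <- number ("+" number)*   (returns the evaluated total)"""
        first = self.token(r"\d+")
        if first is None:
            raise SyntaxError(f"expected number at offset {self.pos}")
        total = int(first)
        while self.token(r"\+") is not None:
            operand = self.token(r"\d+")
            if operand is None:
                raise SyntaxError(f"expected number at offset {self.pos}")
            total += int(operand)
        return total

    def at_end(self):
        """True if only whitespace/comments remain."""
        return SKIP.match(self.text, self.pos).end() == len(self.text)


if __name__ == "__main__":
    parser = Parser("1 /* a */ + /* b\n */ 2 + 3 /* end */")
    print(parser.parse_sum(), parser.at_end())
```

The grammar rules themselves never mention comments; only `token` does. The flexibility comes from the skip pattern being ordinary parser state: inside a string literal or a whitespace-sensitive block, the same parser could switch to a different pattern.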

Tags: parsing, context-free-grammar, ebnf

Jagu
