What is the best way to build a parser in c# to parse my own language? Ideally I'd like to provide a grammar, and get Abstract Syntax Trees as an output. Many thanks, Nestor
LR Parsing: LR parser is one of the most efficient syntax analysis techniques as it works with context-free grammar.
C is a bit hard to parse because statements like `A * B();` will mean different things if A is defined as a type or note. C++ is much harder to parse because the template syntax is hard to disambiguate from less than or greater than.
If you know exactly what language you are going to parse, writing a hand-written parser is straightforward (although laborious). If you don't know the language, then refactoring parsers can be quite difficult. You need good test cases not to break corner cases.
I've had good experience with ANTLR v3. By far the biggest benefit is that it lets you write LL(*) parsers with infinite lookahead - these can be quite suboptimal, but the grammar can be written in the most straightforward and natural way with no need to refactor to work around parser limitations, and parser performance is often not a big deal (I hope you aren't writing a C++ compiler), especially in learning projects.
It also provides pretty good means of constructing meaningful ASTs without need to write any code - for every grammar production, you indicate the "crucial" token or sub-production, and that becomes a tree node. Or you can write a tree production.
Have a look at the following ANTLR grammars (listed here in order of increasing complexity) to get a gist of how it looks and feels
I've played wtih Irony. It looks simple and useful.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With