
Best practices for writing a programming language parser

Are there any best practices that I should follow while writing a parser?

Vinay asked Feb 20 '09 16:02




1 Answer

The received wisdom is to write a grammar and feed it to a parser generator. That seems like good advice: you are using a rigorous tool and presumably reducing both the effort and the potential for bugs.

To use a parser generator, the grammar has to be context free. If you are designing the language to be parsed, you can control this. If you are not sure that the language is context free, starting down the grammar route could cost you a lot of effort. And even when it is context free in practice, unless the grammar is enormous it can be simpler to hand code a recursive descent parser.

Being context free not only makes a parser generator possible, it also makes a hand coded parser a lot simpler. What you end up with is one (or two) functions per phrase. If you organise and name the code cleanly, its structure is not much harder to read than a grammar (if your IDE can show call hierarchies, you can pretty much see what the grammar is).
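
As an illustration of "one function per phrase", here is a minimal sketch of a recursive descent parser for integer arithmetic. The grammar, the class name `Parser`, and the tuple-based tree are assumptions made for this example, not something from a particular project. Each grammar rule appears as a comment directly above the method that parses it.

```python
import re

# A token is a run of digits or a single operator/parenthesis.
TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it, or None at the end."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        """Consume and return the next token (None at the end)."""
        token = self.peek()
        self.pos += 1
        return token

    # expr := term (('+' | '-') term)*
    def parse_expr(self):
        node = self.parse_term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.parse_term())
        return node

    # term := factor (('*' | '/') factor)*
    def parse_term(self):
        node = self.parse_factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.parse_factor())
        return node

    # factor := NUMBER | '(' expr ')'
    def parse_factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token == "(":
            node = self.parse_expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")

def parse(text):
    """Parse a whole expression and reject trailing tokens."""
    parser = Parser(tokenize(text))
    tree = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Calling `parse("1 + 2 * 3")` returns a nested tuple in which the multiplication sits below the addition, because `parse_expr` delegates to `parse_term`. The call hierarchy of the three `parse_*` methods mirrors the three grammar rules.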

The advantages of hand coding:

  • Simpler build
  • Better performance
  • Better control of output
  • Can cope with small deviations, e.g. work with a grammar that is not 100% context free
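
The last point can be sketched with the well-known ambiguity in C-like languages, where `T * x;` declares a pointer if `T` names a type and is a multiplication otherwise. A hand coded parser can simply consult what it already knows at that point. The function `parse_statement`, the `type_names` set, and the returned tuples are hypothetical names chosen for this example, and the sketch handles only this one statement shape.

```python
def parse_statement(tokens, type_names):
    """Parse `X * y ;` as a pointer declaration or as a multiplication.

    `type_names` is the set of identifiers the parser has already seen
    declared as types (an assumption of this sketch). Consulting it is the
    context-sensitive step that a pure context free grammar cannot express.
    """
    if len(tokens) != 4 or tokens[1] != "*" or tokens[3] != ";":
        raise SyntaxError("this sketch only handles statements of the form `X * y ;`")
    first, _, second, _ = tokens
    if first in type_names:
        # `first` is a known type, so this is a declaration of `second`.
        return ("declare_pointer", first, second)
    # Otherwise treat both names as variables being multiplied.
    return ("multiply", first, second)
```

The same token sequence yields a different tree depending on `type_names`, which is a one-line check in hand written code.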

I am not saying grammars are always unsuitable, but often the benefits are minimal and are outweighed by the costs and risks.

(I believe the arguments for them are speciously appealing and that there is a general bias for them as it is a way of signaling that one is more computer-science literate.)

mike g answered Sep 26 '22 06:09