
What is the shortest way to write a parser for my language?

PS. Where can I read about parsing theory?

SomeUser asked Oct 04 '09 13:10




1 Answer

Summary: the shortest is probably Antlr.

It's tempting to go to the Dragon Book to learn about parsing theory. But I don't think the Dragon Book and you have the same idea of what "theory" means. The Dragon Book describes how to build hand-written parsers, parser generators, etc., but you almost certainly want to use a parser-generation tool instead.

A few people have suggested Bison and Flex (or their predecessors, Yacc and Lex). Those are the old stalwarts, but they are not very usable tools. Their documentation is not poor per se, it's just that it doesn't quite help in dealing with the accidental complexity of using them. Their internal data is not well encapsulated, and it is very hard to do anything advanced with them. As an example, in phc we still do not have correct line numbers because it is very difficult. They got better when we modified our grammar to include no-op statements, but that is an incredible hack which should not be necessary.

Ostensibly, Bison and Flex work together, but the interface is awkward. Worse, there are many versions of each, which only play nicely with some specific versions of the other. And, last I checked at least, the documentation of which versions went with which was pretty poor.

Writing a recursive descent parser is straightforward, but can be tedious. Antlr can do that for you, and it seems to be a pretty good toolset, with the benefit that what you learn on this project can be applied to lots of other languages and platforms (Antlr is very portable). There are also lots of existing grammars to learn from.
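To give a feel for what "straightforward, but tedious" means, here is a minimal hand-written recursive descent parser in Python. It is only a sketch: the toy arithmetic grammar, the tuple-based tree, and every name in it (`tokenize`, `Parser`, `parse`) are made up for illustration and have nothing to do with the asker's language or with the code ANTLR generates.

```python
import re

# Hypothetical toy grammar (an assumption, not taken from the question):
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator strings."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each returns a nested-tuple tree."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())  # left-associative
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("1 + 2 * 3"))  # ('+', 1, ('*', 2, 3))
```

Each grammar rule becomes one method, which is why this style is easy to follow. The tedium shows up as the grammar grows: every new rule, precedence level, and error message has to be written and maintained by hand, and that is the part a generator takes over.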

It's not clear what language you're working in, but some languages have excellent parsing frameworks. In particular, the Haskell Parsec library seems very elegant. If you use C++ you might be tempted to use Spirit. I found it very easy to get started with, and difficult--but still possible--to do advanced things with it. This matches my experience of C++ in general. I say I found it easy to start, but then I had already written a couple of parsers, and studied parsing in compiler class.

Long story short: Antlr, unless you have a very good reason not to.

Paul Biggar answered Nov 10 '22 07:11