Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dynamic (?) parser

Does there exist a parser that generates an AST/parse tree at runtime? Kind of like a library that would accept a string of EBNF grammar or something analogous and spit out a data structure?

  • I'm aware of antlr, jlex and their ilk. They generate source code which could do this. (like to skip the compile step)
  • I'm aware of Boost::Spirit, which uses some black magic with C++ syntax to generate such things at execution time (definitely much closer to what I want, but I'm a wuss when it comes to C++. And it's still somewhat limiting, because your grammar is hardcoded)
  • I'm not aware of anything in python or ruby, although a compiler compiler might very well be effective in such a language...

Now I'm aware of parser combinators. (thanks, Jonas) And some libraries (thanks eliben)

incidentally, I also noticed Parsing Expression Grammars lately, which sounds cool were someone to implement it (they say Perl 6 will have it, but Perl evades my understanding)

like image 213
Ellery Newcomer Avatar asked Oct 09 '08 01:10

Ellery Newcomer


2 Answers

Take a look at parser combinators which i think may help you. It is possible to make parsers at runtime using this technique. One popular parser combinator is Parsec which uses Haskell as its host language. From the parsec documentation:

Combinator parsers are written and used within the same programming language as the rest of the program. There is no gap between the grammar formalism (Yacc) and the actual programming language used (C)

Parsers are first-class values within the language. They can be put into lists, passed as parameters and returned as values. It is easy extend the available set of parsers with custom made parsers specific for a certain problem

If you are using .NET take a look at the parser combinator library for F#.

like image 97
Jonas Avatar answered Nov 04 '22 01:11

Jonas


If Java is better for you, there is a port of the Haskell Parsec library - JParsec. Very powerful, though documentation isn't great.

You can coerce it to do a straight forward lex then parse phase, but you can do some interesting things with dynamic lexing and dynamic grammars.

Head twisting stuff.

Because it's all in Java (your Parser is a POJO), you can refactor, and do TDD, and whatever you're used to doing in Java. This is a major advantage to a more traditional ANTLR/JavaCC/JJTree approach.

like image 20
jamesh Avatar answered Nov 04 '22 00:11

jamesh