Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do on-line parsers seem to stop at regexps?

I've been wondering for long why there doesn't seem to be any parsers for, say, BNF, that behave like regexps in various libraries.

Sure, there's things like ANTLR, Yacc and many others that generate code which, in turn, can parse a CFG, but there doesn't seem to be a library that can do that without the intermediate step.

I'm interested in writing a Packrat parser, to boot all those nested-parenthesis-quirks associated with regexps (and, perhaps even more so, for the sport of it), but somehow I have this feeling that I'm just walking into another halting problem -like class of swamps.

Is there a technical/theoretical limitation for these parsers, or am I just missing something?

like image 328
Henrik Paul Avatar asked Apr 29 '09 17:04

Henrik Paul


1 Answers

I think it's more of a cultural thing. The use of context-free grammars is mostly confined to compilers, which typically have code associated with each production rule. In some languages, it's easier to output code than to simulate callbacks. In others, you'll see parser libraries: parser combinators in Haskell, for example. On the other hand, regular expressions see wide use in tools like grep, where it's inconvenient to run the C compiler every time the user gives a new regular expression.

like image 189
Dave Avatar answered Sep 22 '22 03:09

Dave