
Building a Regex Based Parser [closed]

Tags: regex, parsing

Is it stupid to build a regex based parser?

redDragonzz asked Mar 22 '11 09:03


2 Answers

Matching nested parens is exceedingly simple using modern patterns. Ignoring the whitespace (which is there only for readability and needs the /x flag), this sort of thing:

\( (?: [^()] *+ | (?0) )* \)

works for mainstream languages like Perl and PHP, plus anything that uses PCRE.
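
For readers in a language whose built-in regexes lack recursion (Python's standard `re` module is one), here is a rough sketch of what that pattern does, written out by hand. It is a swapped-in technique, a small recursive function rather than a recursive regex, and the names `match_group` and `is_balanced_group` are made up for illustration.

```python
import re

# Runs of characters that are not parentheses: the `[^()]*+` branch.
NON_PAREN = re.compile(r"[^()]+")


def match_group(text: str, pos: int = 0) -> int:
    """Return the index just past the balanced group starting at pos, or -1."""
    if pos >= len(text) or text[pos] != "(":
        return -1
    pos += 1  # consume the opening "\("
    while pos < len(text):
        ch = text[pos]
        if ch == ")":
            return pos + 1  # consume the closing "\)"
        if ch == "(":
            # The `(?0)` branch: recurse into the whole pattern.
            pos = match_group(text, pos)
            if pos == -1:
                return -1
        else:
            # Anything else is a run of non-paren characters.
            pos = NON_PAREN.match(text, pos).end()
    return -1  # ran out of input before the group closed


def is_balanced_group(text: str) -> bool:
    """True if the entire string is one balanced parenthesized group."""
    return match_group(text) == len(text)


print(match_group("(a(b)c) tail"))
print(is_balanced_group("(a(b)"))
```

Each branch of the loop lines up with one alternative in the pattern, which is the point: the regex is just a compact spelling of this recursion.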

However, you really need grammatical regexes for a full parse, or you’ll go nuts. Don’t use a language whose regexes don’t support breaking regexes down into smaller units, or which don’t support proper debugging of their compilation and execution. Life’s too short for low-level hackery. Might as well go back to assembly language if you’re going to do that.
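
To give a feel for "breaking regexes down into smaller units", here is a minimal sketch in Python. It only approximates the idea with string composition and `re.VERBOSE`, since Python's `re` has no named-subpattern definitions of the kind Perl and PCRE offer. The token names and the `tokenize` helper are hypothetical, chosen for illustration.

```python
import re

# Hypothetical token units; each is small enough to read and test on its own.
NUMBER = r"(?P<NUMBER> \d+ (?: \. \d+ )? )"
IDENT = r"(?P<IDENT> [A-Za-z_] \w* )"
OP = r"(?P<OP> [-+*/()] )"
SPACE = r"(?P<SPACE> \s+ )"

# Compose the units into one alternation; VERBOSE ignores the layout whitespace.
TOKEN = re.compile(" | ".join([NUMBER, IDENT, OP, SPACE]), re.VERBOSE)


def tokenize(text):
    """Return (kind, lexeme) pairs for text, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SPACE":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("x1 + 3.5*(y)"))
```

This covers only the lexing half. A full parse still needs the recursive, grammar-style rules described above on top of the token stream.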

I’ve written about recursive patterns, grammatical patterns, and parsing quite a bit: for example, see here for parsing approaches and here for lexer approaches; also, the final solution here.

Also, Perl’s Regexp::Grammars module is especially useful in turning grammatical regexes into parsing structures.

So by all means, go for it. You’ll learn a lot that way.

tchrist answered Nov 15 '22 07:11


For work? Yes. For learning? No.

Matt answered Nov 15 '22 07:11