Building a Regex Based Parser [closed]

Question

Is it stupid to build a regex based parser?

tchrist · Accepted Answer

Matching nested parens is exceedingly simple using modern patterns. Not counting whitespace, this sort of thing:

\( (?: [^()] *+ | (?0) )* \)

works for mainstream languages like Perl and PHP, plus anything that uses PCRE.
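To see what the `(?0)` recursion in that pattern buys you, here is a rough Python sketch. Python's standard `re` module has no recursive patterns, so this swaps in a different technique: it unrolls the recursion by hand to a fixed depth. The names `balanced_pattern` and `is_balanced_group` and the depth limit of 5 are made up for illustration, not taken from the answer.

```python
import re

def balanced_pattern(max_depth: int) -> str:
    """Build a pattern for one parenthesized group nested at most max_depth levels (>= 1)."""
    # Innermost level: a pair of parentheses with no parentheses inside.
    pattern = r"\([^()]*\)"
    for _ in range(max_depth - 1):
        # Each pass wraps the previous level once, standing in for
        # one step of the (?0) recursion that stdlib `re` lacks.
        pattern = r"\((?:[^()]|" + pattern + r")*\)"
    return pattern

# Assumption: five levels of nesting is enough for this demo.
BALANCED = re.compile(balanced_pattern(5))

def is_balanced_group(text: str) -> bool:
    """True if text is exactly one balanced parenthesized group."""
    return BALANCED.fullmatch(text) is not None
```

A truly recursive pattern has no depth limit, which is the point of using an engine that supports it; the unrolled version stops matching as soon as the input nests deeper than the chosen limit.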

However, you really need grammatical regexes for a full parse, or you’ll go nuts. Don’t use a language whose regexes don’t support breaking regexes down into smaller units, or which don’t support proper debugging of their compilation and execution. Life’s too short for low-level hackery. Might as well go back to assembly language if you’re going to do that.
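As a minimal sketch of the "smaller units" idea, assuming plain Python with only the standard `re` module: each token gets its own small named regex, the units are joined into one tokenizer, and a short recursive function handles the nesting that the regexes alone do not. This is not a grammatical regex in the Perl sense, and `TOKEN_UNITS`, `tokenize`, `parse`, and `parse_items` are hypothetical names chosen for the example.

```python
import re

# Small, separately readable units, combined into one tokenizer pattern.
TOKEN_UNITS = {
    "OPEN": r"\(",
    "CLOSE": r"\)",
    "ATOM": r"[^\s()]+",
    "SPACE": r"\s+",
}
TOKENIZER = re.compile(
    "|".join(f"(?P<{name}>{unit})" for name, unit in TOKEN_UNITS.items())
)

def tokenize(text):
    """Yield (kind, value) pairs, skipping whitespace."""
    for match in TOKENIZER.finditer(text):
        if match.lastgroup != "SPACE":
            yield match.lastgroup, match.group()

def parse_items(tokens, position, depth):
    """Collect items until the matching ')' or the end of input."""
    items = []
    while position < len(tokens):
        kind, value = tokens[position]
        if kind == "OPEN":
            # Recurse for the nested group, then resume after its ')'.
            child, position = parse_items(tokens, position + 1, depth + 1)
            items.append(child)
        elif kind == "CLOSE":
            if depth == 0:
                raise ValueError("unbalanced ')'")
            return items, position + 1
        else:
            items.append(value)
            position += 1
    if depth > 0:
        raise ValueError("missing ')'")
    return items, position

def parse(text):
    """Parse nested parentheses into nested Python lists."""
    tree, _ = parse_items(list(tokenize(text)), 0, 0)
    return tree
```

For example, `parse("a (b (c d)) e")` returns `['a', ['b', ['c', 'd']], 'e']`, and input with a missing or extra parenthesis raises `ValueError`. Keeping each unit named also means you can test and debug the pieces one at a time.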

I’ve written about recursive patterns, grammatical patterns, and parsing quite a bit: for example, see here for parsing approaches and here for lexer approaches; also, the final solution here.

Also, Perl’s Regexp::Grammars module is especially useful in turning grammatical regexes into parsing structures.

So by all means, go for it. You’ll learn a lot that way.

Matt · Answer

For work? Yes. For learning? No.

Tags: regex, parsing

redDragonzz
