
PHP wiki markup parser

I've been told that writing a wiki markup parser in PHP that relies on regex and preg functions is a bad idea. But I don't know why.

So what's the best way to go about writing a wiki markup parser in PHP? This is more an academic 'project' than anything else, so the whole point is to write it myself.

Thanks in advance for your help.

VettelS asked Aug 11 '11 01:08


1 Answer

You've been told that because "wiki languages" are ill-defined, to say the least.
The really hard part is trying to parse them at all, not using PHP and regexes.

In fact, I believe software such as MediaWiki actually processes wiki markup with regexes, converting it straight into HTML without going through an intermediate abstract syntax tree. And AFAIK, parsing without regexes is quite inefficient in PHP (unless you use a compiled PHP extension built for parsing).
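To illustrate what "straight into HTML" means, here is a minimal sketch that turns a tiny subset of markup into HTML with nothing but regex substitutions. The function name render_wiki_inline() and the two rules it handles are assumptions made up for this example; this is not MediaWiki's code, and real wiki syntax has far more edge cases.

```php
<?php
// Hypothetical helper: converts a tiny, made-up subset of wiki markup to HTML
// using only regex substitutions, with no intermediate syntax tree.
function render_wiki_inline(string $text): string
{
    // Escape <, > and & first so raw HTML in the input is not passed through.
    // ENT_NOQUOTES leaves apostrophes alone, since the markup below relies on them.
    $html = htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8');

    // '''bold''' is handled before ''italic'' because the delimiters overlap.
    $html = preg_replace("#'''(.+?)'''#", '<strong>$1</strong>', $html);
    $html = preg_replace("#''(.+?)''#", '<em>$1</em>', $html);

    return $html;
}

echo render_wiki_inline("'''Hello''' ''world'' <b>"), "\n";
```

The order of the substitutions matters here, which is one small taste of why an ill-defined markup gets awkward quickly as more rules are added.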

Be aware that such software also has a number of syntax features that can be enabled on demand, which might prove hard to implement efficiently.

The only real trouble? You have to use a lot of escapes to match characters like [ and ], and it's easy to get confused when backslashes pile up in both the PHP string and the preg_match() pattern. Apart from that, a simple preg_match_all('#\\[\\[(.*?)\\]\\]#', $data, $matches, PREG_SET_ORDER); should get you up and running.

(unless I got confused by too many levels of backslashing, that is) :)
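For what it's worth, here is a small sketch of that call in context. The sample text in $data and the $links array are made up for illustration; only the preg_match_all() call comes from the answer.

```php
<?php
// Sample input, made up for illustration.
$data = 'See [[Main Page]] and [[Help:Contents|the help pages]].';

// In a single-quoted PHP string each \\ becomes one backslash, so the regex
// engine sees #\[\[(.*?)\]\]# and treats the brackets as literal characters.
preg_match_all('#\\[\\[(.*?)\\]\\]#', $data, $matches, PREG_SET_ORDER);

$links = [];
foreach ($matches as $match) {
    // $match[0] is the full "[[...]]" text, $match[1] is the captured inside.
    $links[] = $match[1];
}

print_r($links);
```

With PREG_SET_ORDER, each element of $matches is one complete match, which keeps the loop simple. Note that '#\[\[(.*?)\]\]#' with single backslashes should behave the same, because \[ is not an escape sequence in a single-quoted PHP string and the backslash is kept as-is.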

ZJR answered Oct 17 '22 23:10