I've been told that writing a wiki markup parser in PHP that relies on regex and preg functions is a bad idea. But I don't know why.
So what's the best way to go about writing a wiki markup parser in PHP? This is more an academic 'project' than anything else, so the whole point is to write it myself.
Thanks in advance for your help.
You've been told that because "wiki languages" are, to say the least, ill-defined.
The really hard part is parsing them at all, not doing it with PHP and regexps.
In fact, I believe software like MediaWiki actually does process wiki markup with regexps, converting it straight into HTML without building an intermediate abstract syntax tree. And AFAIK, real parsing without regexps is quite inefficient in pure PHP (unless you use a compiled PHP extension for parsing).
Be aware that such software also has a number of syntax features that can be enabled on demand, and those might prove hard to implement efficiently.
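As a minimal sketch of that straight-to-HTML regex approach (MediaWiki-style bold and italics; the function name and rule table are my own, not taken from any real wiki engine):

```php
<?php
// Sketch: convert a couple of MediaWiki-style inline markups straight
// to HTML with preg_replace, without an intermediate syntax tree.
// Order matters: bold ('''...''') must be handled before italics (''...'').
function wikiInlineToHtml(string $text): string {
    $rules = [
        "#'''(.+?)'''#" => '<b>$1</b>',  // bold
        "#''(.+?)''#"   => '<i>$1</i>',  // italics
    ];
    return preg_replace(array_keys($rules), array_values($rules), $text);
}

echo wikiInlineToHtml("This is '''bold''' and ''italic'' text.");
// → This is <b>bold</b> and <i>italic</i> text.
```

Each markup feature becomes one more pattern/replacement pair, which is roughly why the "enable features on demand" part gets messy: the rules interact, and their order starts to matter.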
The only real trouble? You have to escape characters like [ and ], and it's easy to get confused by the many levels of backslashes needed with preg_match() in PHP. Apart from that, a simple
preg_match_all('#\\[\\[(.*?)\\]\\]#', $data, $matches, PREG_SET_ORDER);
should get you up and running.
(Unless I got confused by too many levels of backslashing myself, that is.) :)
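To make the escaping point concrete, here is a runnable sketch built around that exact preg_match_all call (the sample text and the loop are my own illustration):

```php
<?php
// Extract [[internal links]] from wiki text.
// Inside a single-quoted PHP string, \\[ becomes \[ by the time it
// reaches the regex engine, i.e. an escaped literal bracket.
$data = 'See [[Main Page]] and [[Help:Contents]] for details.';

preg_match_all('#\\[\\[(.*?)\\]\\]#', $data, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[0] is the full match ([[...]]), $m[1] the captured link target.
    printf("found link: %s\n", $m[1]);
}
// found link: Main Page
// found link: Help:Contents
```

Note that in single-quoted strings you could also write '#\[\[(.*?)\]\]#' directly, since \[ is not a recognized escape there; the doubled form just makes the intent explicit.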