I am parsing a document and would like to split it up using PHP's preg_split().
The document is organized into sections with headings of:
==Section Title==
The problem is that each section has subsections with headings of:
===Subsection Title===
Question: Is there a way to use a regex to match text that sits between two equals signs but not between three equals signs?
Thanks!
P.S. I am trying to learn regex, but I still find it pretty confusing!
Here's one that should work:
(?<!=)==(?!=)(.*)(?<!=)==(?!=)
How it works:
The pattern (?<!=)==(?!=) appears twice (beginning and end). It matches two equals signs that are not preceded or followed by another equals sign, using (?<!=) (negative lookbehind) and (?!=) (negative lookahead). This ensures that you don't accidentally match two equals signs that are part of a longer run such as ===.
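To see the lookarounds at work in isolation, here is a minimal sketch that searches for just the delimiter part of the pattern. The sample string and variable names are made up for illustration and are not from the question.

```php
<?php
// Hypothetical sample text: one ==heading== and one ===subheading===.
$text = "==Intro== and ===Details===";

// Two equals signs with no third equals sign on either side.
$delimiter = '/(?<!=)==(?!=)/';

// PREG_OFFSET_CAPTURE returns each match as [matched string, byte offset].
preg_match_all($delimiter, $text, $matches, PREG_OFFSET_CAPTURE);

// Keep only the offsets at which the delimiter matched.
$offsets = array_column($matches[0], 1);
print_r($offsets);
```

Only the two == around "Intro" are reported (offsets 0 and 7). Every == inside the === runs is rejected by one of the two lookarounds.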
The (.*) in the middle captures whatever text sits between the two pairs of ==. Since . does not match a newline by default, the match stays on a single line.
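As a rough sketch of how this could be wired into preg_split(), the snippet below splits a small document on its == headings and keeps the titles by passing PREG_SPLIT_DELIM_CAPTURE. The sample document and the $sections layout are assumptions for illustration, not something the question specifies.

```php
<?php
// Hypothetical sample document, not from the original question.
$document = "==First==\nIntro text.\n===Sub A===\nSub text.\n==Second==\nMore text.\n";

// The pattern from the answer, wrapped in PCRE delimiters.
$pattern = '/(?<!=)==(?!=)(.*)(?<!=)==(?!=)/';

// PREG_SPLIT_DELIM_CAPTURE keeps the captured title (group 1) in the result,
// so the array alternates: text, title, text, title, text.
$parts = preg_split($pattern, $document, -1, PREG_SPLIT_DELIM_CAPTURE);

// The first element is whatever precedes the first heading (empty here).
array_shift($parts);

// Pair each title with the text that follows it.
$sections = [];
foreach (array_chunk($parts, 2) as [$title, $body]) {
    $sections[trim($title)] = trim($body);
}

print_r($sections);
```

The ===Sub A=== heading is left untouched inside the body of the "First" section, so each section body could be split again with a separate pattern for subsections if needed.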
I'm not sure if you are just worried about those headings, or parsing all of WikiCreole, but libraries are available for parsing WikiCreole in PHP.
http://wiki.wikicreole.org/Libraries