I am parsing a document and would like to split it up using PHP's preg_split().
The document is organized into sections with headings of:
==Section Title==
The problem is that each section has subsections with headings of:
===Subsection Title===
Question: Is there a way to use regex to match headings that sit between two equals signs but not those between three equals signs?
Thanks!
P.S. I am trying to learn regex, but I still find it pretty confusing!
Here's one that should work:
(?<!=)==(?!=)(.*)(?<!=)==(?!=)
How it works:
The pattern (?<!=)==(?!=) appears twice (beginning and end). It matches two equals signs that are not preceded or followed by another equals sign using (?<!=) (negative lookbehind) and (?!=) (negative lookahead). The purpose of this is to ensure that you don't accidentally match two equals signs that are part of a larger group such as ===.
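To see the delimiter part on its own, here is a small sketch that counts how often it matches in each kind of heading. The variable names are made up for illustration, and the two sample strings are the heading examples from the question.

```php
<?php
// The delimiter from the answer: exactly two equals signs,
// with no third "=" directly before or after them.
$delimiter = '/(?<!=)==(?!=)/';

// preg_match_all() returns the number of full matches it found.
$inSection    = preg_match_all($delimiter, '==Section Title==');
$inSubsection = preg_match_all($delimiter, '===Subsection Title===');

var_dump($inSection);    // int(2): the opening and the closing ==
var_dump($inSubsection); // int(0): every == here touches a third =
```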
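The (.*) in the middle captures whatever text exists between the two pairs of ==. Since . does not match a newline by default, the capture stays on one line. Note that .* is greedy, so if a single line holds several headings, use (.*?) instead to stop at the first closing ==.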
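Since the question was about preg_split(), here is a rough sketch of how the full pattern might be used with it. The sample document and the variable names are invented for illustration, and it assumes each heading sits on its own line. The PREG_SPLIT_DELIM_CAPTURE flag makes preg_split() return the captured title along with the text around it.

```php
<?php
// Hypothetical sample document, made up for this example.
$doc = "intro\n==First==\nbody one\n===Sub===\nsub body\n==Second==\nbody two\n";

// The pattern from the answer, wrapped in PHP regex delimiters.
$pattern = '/(?<!=)==(?!=)(.*)(?<!=)==(?!=)/';

// Split on the ==...== headings and keep each captured title in the result.
// $parts alternates: text, title, text, title, text.
$parts = preg_split($pattern, $doc, -1, PREG_SPLIT_DELIM_CAPTURE);

// Pair each title (odd index) with the body that follows it.
$sections = [];
for ($i = 1; $i + 1 < count($parts); $i += 2) {
    $sections[trim($parts[$i])] = trim($parts[$i + 1]);
}

print_r($sections);
```

The ===Sub=== heading is not treated as a split point, so it stays inside the body of the "First" section, where it could be split again with a three-equals version of the same pattern.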
I'm not sure if you are just worried about those headings, or parsing all of WikiCreole, but libraries are available for parsing WikiCreole in PHP.
http://wiki.wikicreole.org/Libraries