
Regex for string between two of a symbol but not three of that symbol

Tags: regex, php

I am parsing a document and would like to split it up using PHP's preg_split().

The document is organized into sections with headings of:

==Section Title==

The problem is that each section has subsections with headings of:

===Subsection Title===

Question: Is there a way to use regex to parse through the document for things that are between two equal signs but not between three equal signs?

Thanks!

P.S. I am trying to learn regex, but I still find it pretty confusing!

OneThreeSeven asked Sep 29 '12 23:09


2 Answers

Here's one that should work:

(?<!=)==(?!=)(.*)(?<!=)==(?!=)

How it works:

The pattern (?<!=)==(?!=) appears twice (beginning and end). It matches two equals signs that are not preceded or followed by another equals sign using (?<!=) (negative lookbehind) and (?!=) (negative lookahead). The purpose of this is to ensure that you don't accidentally match two equals signs that are part of a larger group such as ===.

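The (.*) in the middle captures whatever text lies between the two pairs of ==. Note that .* is greedy and, by default, does not match newlines, so the pattern assumes one heading per line; if a line could contain more than one heading, the lazy form (.*?) keeps each match from running on to the last == on that line.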
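Since the question asks about preg_split(), here is a minimal sketch of how the pattern might be used there. The sample document, the variable names, and the `/` delimiters are assumptions made for illustration, and the capture group uses the lazy (.*?) instead of (.*) so that a match stops at the nearest closing ==.

```php
<?php
// Hypothetical sample document; headings and body text are invented for illustration.
$doc = <<<TXT
Intro text
==First Section==
Body of first section
===Subsection A===
Subsection body
==Second Section==
Body of second section
TXT;

// A pair of equals signs that is neither preceded nor followed by a third one,
// then a lazily captured title, then the same guarded pair again.
$pattern = '/(?<!=)==(?!=)(.*?)(?<!=)==(?!=)/';

// PREG_SPLIT_DELIM_CAPTURE keeps each captured title in the result,
// so the array alternates between body text and section titles.
$parts = preg_split($pattern, $doc, -1, PREG_SPLIT_DELIM_CAPTURE);

foreach ($parts as $i => $part) {
    // Odd indexes hold titles, even indexes hold the text around them.
    $label = ($i % 2 === 1) ? 'TITLE' : 'BODY';
    echo $label, ': ', trim($part), "\n";
}
```

With this input, the ===Subsection A=== line is not treated as a delimiter, so it stays inside the body of the first section. Without the PREG_SPLIT_DELIM_CAPTURE flag, preg_split() would discard the titles and return only the text between them.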

Jon answered Sep 25 '22 19:09


I'm not sure if you are just worried about those headings, or parsing all of WikiCreole, but libraries are available for parsing WikiCreole in PHP.

http://wiki.wikicreole.org/Libraries

Brad answered Sep 24 '22 19:09