
Find multiple patterns with a single preg_match_all in PHP

Tags: regex, php

Using PHP and preg_match_all I'm trying to get all the HTML content between the following tags (and the tags also):

<p>paragraph text</p>
don't take this
<ul><li>item 1</li><li>item 2</li></ul>
don't take this
<table><tr><td>table content</td></tr></table>

I can get one of them just fine:

preg_match_all("(<p>(.*)</p>)siU", $content, $matches, PREG_SET_ORDER);

Is there a way to get all the <p></p>, <ul></ul> and <table></table> content with a single preg_match_all? I need them to come out in the order they were found so I can echo the content and it will make sense.

So if I did a preg_match_all on the above content then iterated through the $matches array it would echo:

<p>paragraph text</p>
<ul><li>item 1</li><li>item 2</li></ul>
<table><tr><td>table content</td></tr></table>
Asked by Marcus, Dec 06 '22 00:12


1 Answer

Use | to match one of a group of strings: p|ul|table

Use a backreference to match the appropriate closing tag: \\1. The outer ( and ) in the pattern are PHP's bracket-style delimiters, not a capturing group, so (p|ul|table) is group 1 and (.*) is group 2.

Putting that all together:

preg_match_all("(<(p|ul|table)>(.*)</\\1>)siU", $content, $matches, PREG_SET_ORDER);
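For illustration, here is a minimal sketch that runs a pattern of this shape against the sample input from the question. It assumes "~" as the delimiter (a choice made for this sketch, not part of the original answer) so that the group numbering is unambiguous: group 1 is the tag name, group 2 is the content, and \1 refers back to the tag name. The $found array is a hypothetical helper name.

```php
<?php
// Sample input taken from the question.
$content = <<<HTML
<p>paragraph text</p>
don't take this
<ul><li>item 1</li><li>item 2</li></ul>
don't take this
<table><tr><td>table content</td></tr></table>
HTML;

// s = dot matches newlines, i = case-insensitive, U = ungreedy quantifiers.
// \1 must match the same tag name that group 1 captured.
preg_match_all('~<(p|ul|table)>(.*)</\1>~siU', $content, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER each entry is one match, in the order it was found:
// [0] = whole match including tags, [1] = tag name, [2] = inner content.
$found = [];
foreach ($matches as $match) {
    $found[] = $match[0];
}

echo implode("\n", $found), "\n";
```

Because preg_match_all scans the subject from left to right, the entries in $matches come out in document order, which is what the question asks for.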

This is only going to work if your input HTML follows a very strict structure. The tags cannot contain spaces or attributes. It also breaks on nesting, for example a <ul> inside another <ul>, where the match stops at the first closing tag. Consider using an HTML parser to do a proper job.
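As a rough sketch of the parser route, the following uses PHP's bundled DOMDocument class. The wrapper <div>, the choice to look only at top-level elements, and the $wanted and $found names are assumptions made for this example, not something the question or answer specifies.

```php
<?php
// Sample input taken from the question.
$content = <<<HTML
<p>paragraph text</p>
don't take this
<ul><li>item 1</li><li>item 2</li></ul>
don't take this
<table><tr><td>table content</td></tr></table>
HTML;

// Collect parser warnings internally instead of emitting them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Wrap the fragment in a single container element (an assumption of this sketch)
// so its top-level nodes can be walked in document order.
$doc->loadHTML('<div>' . $content . '</div>');
libxml_clear_errors();

$wanted = ['p', 'ul', 'table'];
$found = [];

$container = $doc->getElementsByTagName('div')->item(0);
foreach ($container->childNodes as $node) {
    // Keep only element nodes whose tag name is one we want; text nodes are skipped.
    if ($node instanceof DOMElement && in_array($node->nodeName, $wanted, true)) {
        $found[] = $doc->saveHTML($node);
    }
}

echo implode("\n", $found), "\n";
```

Unlike the regular expression, this approach should tolerate attributes and whitespace inside tags, and nested lists or tables stay inside their parent element rather than cutting the match short.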

Answered by moinudin, Dec 10 '22 09:12