I know this question is around on SO, but I can't find the right one, and I still suck at regex :/
I have a string, and that string is valid HTML. Now I want to find all the tags with a certain name and attribute.
I tried this regex (e.g. a div with a type): /(<div type="special_type" src="(.*?)<\/div>)/
Example string:
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
If I use preg_match, then I only get <div type="special_type" src="bla"> match me</div>
which is logical because the other one has its attributes in a different order.
What regex do I need to get the following array when using preg_match on the example string?
array(0 => '<div type="special_type" src="bla"> match me</div>',
1 => '<div src="blaw" type="special_type" > match me too</div>')
General advice: don't use regex to parse HTML. It will get messy if the HTML changes.
Use DOMDocument instead:
$str = <<<EOF
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;
$doc = new DOMDocument();
$doc->loadHTML($str);
$selector = new DOMXPath($doc);
$result = $selector->query('//div[@type="special_type"]');
// loop through all found items and print each src attribute on its own line
foreach ($result as $node) {
    echo $node->getAttribute('src'), PHP_EOL;
}
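The loop above only prints the src attribute, while the question asks for an array of the complete matching tags. A minimal sketch of one way to get that, assuming it is acceptable for the markup to be re-serialized by DOMDocument (so whitespace inside a tag may come back normalized rather than byte-for-byte identical); the $matches variable name is just an illustration:

```php
<?php
$str = <<<EOF
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($str);

$selector = new DOMXPath($doc);

// Collect the serialized markup of every matching <div>.
$matches = [];
foreach ($selector->query('//div[@type="special_type"]') as $node) {
    // saveHTML($node) serializes only this element, including its tags.
    // The output is regenerated from the DOM, not copied from the input string.
    $matches[] = $doc->saveHTML($node);
}

print_r($matches);
```

Attribute order does not matter to the XPath query, so both divs are found. If the src attribute should be required as well, the expression can be narrowed to //div[@type="special_type" and @src].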