I am working with a CMS that insists on putting lots of junk markup and empty tags between the </figure> and <figcaption> tags.
I'm trying to use a regular expression to match & remove this junk (sadly fixing the CMS isn't possible).
I seem to have created a regex that's a bit too hungry: it also strips the </figure> and <figcaption> tags themselves.
$str = '<p></p><figure class="image"><img title="Screenshot 2014-08-26 16.34.12.png" alt="Screenshot 2014-08-26 16.34.12.png" src="/image/Screenshot%202014-08-26%2016.34.12.png" class="image-style-none" typeof="foaf:Image"></figure><p></p>
<p>&nbsp;</p>
<p></p><figcaption>Screenshot 2014-08-26 16.34.12.png</figcaption><p></p>
<p> </p>
<p> </p>
<p></p>';
preg_replace('#(</figure>).*?(<figcaption>)#s', '[replace-me]', $str);
Can anyone point me in the right direction?
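Your pattern consumes the tags because they are part of the match. Use a lookbehind and a lookahead instead, so the tags are asserted but not replaced:
$str = preg_replace('#(?<=</figure>).*?(?=<figcaption>)#s', '[replace-me]', $str);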
Aren't regular expressions just so fun!
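For comparison, here is a minimal sketch of two ways to keep the tags while dropping what sits between them: the lookaround pattern, and the original capturing pattern with the groups put back via `$1$2`. The input string is a shortened, hypothetical stand-in for the CMS output, and the variable names are illustrative only.

```php
<?php
// Shortened, hypothetical stand-in for the CMS markup in the question.
$str = '<figure class="image"><img src="/image/example.png"></figure><p></p>' . "\n"
     . '<p> </p>' . "\n"
     . '<p></p><figcaption>Example caption</figcaption><p></p>';

// Option 1: lookarounds assert the tags without consuming them,
// so only the junk in between is matched and removed.
$viaLookaround = preg_replace('#(?<=</figure>).*?(?=<figcaption>)#s', '', $str);

// Option 2: keep the capture groups from the question and
// re-insert both tags in the replacement string.
$viaGroups = preg_replace('#(</figure>).*?(<figcaption>)#s', '$1$2', $str);

echo $viaLookaround, "\n";
```

Both calls should produce the same string here. The `s` modifier matters in either case, because the junk spans several lines and `.` does not match newlines without it.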