I have a text area whose text may contain <img> tags. I need to find the tags whose src attribute contains a specific string ("/storyimages/") and delete them. So for instance, if I have the text
<br><img src="/storyimages/myimage.jpg" align="right" WIDTH="105" HEIGHT="131"><b>(CNS) </b>Lorem ipsum dolor...
I just want to get rid of the whole tag and replace it with ''. The regex pattern I'm trying to use is
/<img src=.*\/storyimages\/.*>/
but it's not working. What happens is that it identifies the start of the tag ok, but it's not identifying the closing > character, so if I use preg_match(), the match starts with <img but runs past the tag's closing >.
I know you're not supposed to use a regex on HTML, but this isn't embedded tags; it's just one tag in the midst of a bunch of text, so I should be ok. From what I can see, the > isn't a special character, but even if I escape it, I still get the same result.
Is there something simple I'm missing that would make this work? Or do I need to write some sort of function that loops over the string character by character to find the positions of the open and close brackets and then replace them?
The interesting thing is that when I try this with a regex tester, it works fine, but when I actually run the code, I get the problem described above.
Thanks.
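Use the regex <img src=.*?\/storyimages\/.*?> instead.
The main point is the *? quantifier, which makes the match non-greedy (i.e. it matches as few characters as possible). With the greedy .*, the match runs on to the last > it can reach rather than stopping at the tag's closing >.
Here is some sample PHP code: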
$re = "/<img src=.*?\\/storyimages\\/.*?>/";
$str = "<br><img src=\"/storyimages/myimage.jpg\" align=\"right\" WIDTH=\"105\" HEIGHT=\"131\"><b>(CNS) </b>Lorem ipsum dolor...";
preg_match($re, $str, $matches);
The match will look like <img src="/storyimages/myimage.jpg" align="right" WIDTH="105" HEIGHT="131">.
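Since the goal in the question is to delete the tag rather than just match it, here is a minimal sketch that compares the greedy and non-greedy patterns and then removes the tag with preg_replace(). The variable names ($greedy, $lazy, $clean) are only illustrative, and the input is the sample string from the question.

```php
<?php
// Sample input from the question.
$str = '<br><img src="/storyimages/myimage.jpg" align="right" WIDTH="105" HEIGHT="131"><b>(CNS) </b>Lorem ipsum dolor...';

// Greedy: the final .* runs to the last ">" in the string,
// so the match also swallows "<b>(CNS) </b>".
preg_match('/<img src=.*\/storyimages\/.*>/', $str, $greedy);

// Non-greedy: .*? stops at the first ">" after "/storyimages/".
$lazy = '/<img src=.*?\/storyimages\/.*?>/';

// Replace every matching tag with an empty string.
$clean = preg_replace($lazy, '', $str);

echo $greedy[0], "\n"; // the tag plus "<b>(CNS) </b>"
echo $clean, "\n";     // <br><b>(CNS) </b>Lorem ipsum dolor...
```

If the text can also contain other <img> tags, using the negated class [^>]* in place of .*? keeps each match inside a single tag.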