I'm looking for a regex to use in PHP (maybe with preg_replace?) that strips from a text every unclosed < (and ONLY the unclosed ones) and every unopened > (and ONLY the unopened ones).
Some examples:
<name> aaaaaa bbbbb <  aagfetfe <aaaa/>
to
<name> aaaaaa bbbbb   aagfetfe <aaaa/>
<<1111>sbab  < amkka <pippo>
to
<1111>sbab   amkka <pippo>
<1111> aaaa <    thehehe  > aaaaaa <ciao>
to
<1111> aaaa <    thehehe  > aaaaaa <ciao>
<1111> aaaa   thehehe  > aaaaaa <ciao>
to 
<1111> aaaa   thehehe   aaaaaa <ciao>
<1111> aaaa   thehehe  < aaaaaa
to 
<1111> aaaa   thehehe   aaaaaa
I really can't do it; it's too difficult for me.
$s = preg_replace("/<([^<>]*)(?=<|$)/", "$1", $s); # remove unclosed '<'
$s = preg_replace("/(^|(?<=>))([^<>]*)>/", "$2", $s); # remove unopened '>' ($2 is the text before it; $1 is always empty)
Do you understand why?
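To make that easier to try out, here is a minimal sketch that wraps the two replacements in a function. The name strip_unbalanced_brackets is made up for illustration. The second replacement keeps group 2, because group 1 only captures the empty start-of-string or lookbehind position, so substituting group 1 would also delete the text in front of the '>'.

```php
<?php
// Hypothetical helper; not part of PHP or any library.
function strip_unbalanced_brackets(string $s): string
{
    // Unclosed '<': a '<' whose following text reaches another '<' or the
    // end of the string before any '>'. Keep the text ($1), drop the '<'.
    $s = preg_replace('/<([^<>]*)(?=<|$)/', '$1', $s);

    // Unopened '>': a '>' whose preceding text reaches back to a '>' or the
    // start of the string before any '<'. Keep the text ($2), drop the '>'.
    $s = preg_replace('/(^|(?<=>))([^<>]*)>/', '$2', $s);

    return $s;
}

// Try it on two inputs from the question.
echo strip_unbalanced_brackets('<<1111>sbab  < amkka <pippo>'), PHP_EOL;
echo strip_unbalanced_brackets('<1111> aaaa   thehehe  > aaaaaa <ciao>'), PHP_EOL;
```

Both character classes exclude '<' and '>', so a properly paired < ... > is never touched.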
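For unclosed <, you can replace <(?=[^>]*(<|$)) with an empty string. It matches every < that is not followed by a closing > before the next < or the end of the line. That condition is written as a positive lookahead: (?=[^>]*(<|$)) asserts that, reading forward without crossing a >, you reach another < or the end of the line.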
For unopened >, you can replace ((^|>)[^<]*)> with $1. It matches text that starts with a > (or at the line start), contains no <, and ends with a >. $1 holds everything except that final >.
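A rough sketch of these two patterns in PHP, with a made-up function name (strip_unbalanced_lookahead). One caveat that seems worth handling: [^<]* is greedy and can itself run across > characters, so a single pass drops only the last unopened > in a stretch that contains several. The sketch therefore repeats the second replacement until the string stops changing.

```php
<?php
// Hypothetical helper; not part of PHP or any library.
function strip_unbalanced_lookahead(string $s): string
{
    // Unclosed '<': the lookahead consumes nothing, so only the '<' itself
    // is matched and replaced by the empty string.
    $s = preg_replace('/<(?=[^>]*(<|$))/', '', $s);

    // Unopened '>': $1 is everything matched except the final '>'.
    // Loop because one pass removes at most one '>' per stretch of text.
    do {
        $before = $s;
        $s = preg_replace('/((^|>)[^<]*)>/', '$1', $s);
    } while ($s !== $before);

    return $s;
}

echo strip_unbalanced_lookahead('a > b > c <d>'), PHP_EOL;
```

The loop ends once a pass changes nothing. Every pass that does change the string removes at least one character, so it cannot run forever.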