Here's the goal: replace all standalone ampersands with &amp;amp; but do NOT replace those that are already part of an HTML entity such as &amp;nbsp;.
I think I need a regular expression for PHP (preferably for preg_ functions) that will match only standalone ampersands. I just don't know how to do that with preg_replace.
PHP's htmlentities() has a double_encode argument for this.
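A minimal sketch of that approach, assuming UTF-8 input. The wrapper name escape_keep_entities is hypothetical; htmlentities() and its fourth parameter, double_encode, are standard PHP.

```php
<?php
// Hypothetical helper name, used only for illustration.
function escape_keep_entities(string $txt): string
{
    // Passing false as the fourth argument (double_encode) tells PHP
    // to leave anything that already is a valid entity untouched.
    return htmlentities($txt, ENT_QUOTES, 'UTF-8', false);
}

echo escape_keep_entities('Tom & Jerry&nbsp;&amp; &#169;'), "\n";
// Tom &amp; Jerry&nbsp;&amp; &#169;
```

Note that this also encodes <, >, quotes and non-ASCII characters, so it only fits if you want the whole string escaped, not just the ampersands. htmlspecialchars() accepts the same double_encode argument.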
If you want to do things like that with regular expressions, negative lookahead assertions come in handy:
preg_replace('/&(?!(?:[[:alpha:]][[:alnum:]]*|#(?:[[:digit:]]+|[Xx][[:xdigit:]]+));)/', '&amp;', $txt);
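To see what the lookahead does and does not match, here is a small sketch that wraps the same pattern in a function. The name escape_bare_ampersands is hypothetical, and the fallback to the original string on a regex error is my own assumption about what you would want.

```php
<?php
// Hypothetical helper name, used only for illustration.
function escape_bare_ampersands(string $txt): string
{
    // Match "&" only when it is NOT followed by something shaped like
    // an entity: a name (&nbsp;), a decimal reference (&#169;) or a
    // hexadecimal reference (&#xA9;), each ending in ";".
    $pattern = '/&(?!(?:[[:alpha:]][[:alnum:]]*|#(?:[[:digit:]]+|[Xx][[:xdigit:]]+));)/';

    // preg_replace() returns null on failure; fall back to the input.
    return preg_replace($pattern, '&amp;', $txt) ?? $txt;
}

echo escape_bare_ampersands('AT&T &nbsp; &amp; &#169; &#xA9; a & b'), "\n";
// AT&amp;T &nbsp; &amp; &#169; &#xA9; a &amp; b
```

The pattern only checks the shape of the entity, so something like &foo; is left alone even though it is not a real entity name.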
You could always run html_entity_decode() before you run htmlentities(). That works unless you only want to encode ampersands (and even then you can play with the charset parameters).
Much easier and faster than a regex.
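A rough sketch of the decode-then-encode idea, assuming UTF-8 text. The function name normalize_entities is hypothetical; both built-ins are standard PHP.

```php
<?php
// Hypothetical helper name, used only for illustration.
function normalize_entities(string $txt): string
{
    // Step 1: turn every existing entity back into its raw character,
    // so "&amp;" and a bare "&" become the same thing.
    $decoded = html_entity_decode($txt, ENT_QUOTES, 'UTF-8');

    // Step 2: encode everything exactly once.
    return htmlentities($decoded, ENT_QUOTES, 'UTF-8');
}

echo normalize_entities('Tom & Jerry&nbsp;&amp; &copy;'), "\n";
// Tom &amp; Jerry&nbsp;&amp; &copy;
```

Keep in mind that this rewrites the whole string, not just the ampersands: numeric references come back as named entities where one exists, and raw characters such as < or é get encoded too.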