I am aware that regex is not ideal for use with HTML strings and I have looked at the PHP Simple HTML DOM Parser but still believe this is the way to go. All the HTML tags will be generated by my forum software so they will be consistent and valid HTML.
What I am trying to do is make a plugin that will find a list of keywords (or phrases) in a string of HTML and replace them with a link I specify. For example, if someone types:
I use Amazon for that.
it would replace it with:
I use <a href="http://www.amazon.com">Amazon</a> for that.
The problem, of course, is that if "amazon" is in the URL it would also get replaced. I solved that issue with a callback function found on this site, slightly modified.
But I still have an issue: it still replaces words between opening and closing tags.
<a href="http://www.amazon.com">My Amazon Link</a>
It will match the "Amazon" in "My Amazon Link"
What I really need is a regex to match, say, "amazon" anywhere except between <a href and </a>.
Any ideas?
Using the DOM would certainly be preferable.
However, you might get away with this:
$result = preg_replace('%Amazon(?![^<]*</a>)%i', '<a href="http://www.amazon.com">Amazon</a>', $subject);
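It matches Amazon only if the text after it, up to the next tag, is not followed directly by a closing </a> tag. That skips matches inside link text and inside the attributes of an opening <a> tag, as long as there are no nested tags between the match and </a>.
It will therefore change this: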
I use Amazon for that.
I use <a href="http://www.amazon.com">Amazon</a> for that.
<a href="http://www.amazon.com">My Amazon Link</a>
It will match the "Amazon" in "My Amazon Link"
into this:
I use <a href="http://www.amazon.com">Amazon</a> for that.
I use <a href="http://www.amazon.com">Amazon</a> for that.
<a href="http://www.amazon.com">My Amazon Link</a>
It will match the "<a href="http://www.amazon.com">Amazon</a>" in "My <a href="http://www.amazon.com">Amazon</a> Link"
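If you do go the regex route, the same lookahead can be wrapped in a small helper that takes a keyword-to-URL map. This is only a sketch under some assumptions: linkKeywords is a made-up name, and the \b word boundaries and the callback that keeps the original capitalization are additions of mine, not part of the pattern above. The last call shows the main limitation: a nested tag inside the link defeats the lookahead.

```php
<?php
// Hypothetical helper (not from the original answer): link each keyword
// unless the next tag after the match is a closing </a>.
function linkKeywords(string $html, array $links): string
{
    foreach ($links as $keyword => $url) {
        // preg_quote() escapes regex metacharacters and the % delimiter.
        // The \b boundaries are an assumption, added so that a keyword
        // does not match in the middle of a longer word.
        $pattern = '%\b' . preg_quote($keyword, '%') . '\b(?![^<]*</a>)%i';
        $html = preg_replace_callback(
            $pattern,
            function (array $m) use ($url): string {
                // $m[0] is the text as typed, so its capitalization is kept.
                return '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '">' . $m[0] . '</a>';
            },
            $html
        );
    }
    return $html;
}

$links = ['Amazon' => 'http://www.amazon.com'];

echo linkKeywords('I use Amazon for that.', $links), "\n";
// I use <a href="http://www.amazon.com">Amazon</a> for that.

// Existing links with plain text content are left alone.
echo linkKeywords('<a href="http://www.amazon.com">My Amazon Link</a>', $links), "\n";

// Limitation: with a nested tag, the next tag after each match is <b> or
// </b> rather than </a>, so both the URL and the link text get rewritten
// and the markup ends up broken.
echo linkKeywords('<a href="http://www.amazon.com"><b>Amazon</b> store</a>', $links), "\n";
```

If your forum software can ever emit formatting tags inside links, that last case is the reason to prefer a DOM-based approach.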
Don't do this. You cannot reliably do this with Regex, no matter how consistent your HTML is.
Something like this should work, however:
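<?php
$dom = new DOMDocument;
$dom->load('test.xml');
$x = new DOMXPath($dom);
// Select text nodes that contain the keyword and are not already inside a link.
$nodes = $x->query("//text()[contains(., 'Amazon')][not(ancestor::a)]");
foreach ($nodes as $node) {
    // A single text node may contain the keyword more than once.
    while (false !== ($pos = strpos($node->nodeValue, 'Amazon'))) {
        // Split the text node into three parts: before, keyword, after.
        $word = $node->splitText($pos);
        $after = $word->splitText(strlen('Amazon'));
        // Wrap the keyword node in a new <a> element.
        $link = $dom->createElement('a');
        $link->setAttribute('href', 'http://www.amazon.com');
        $word->parentNode->replaceChild($link, $word);
        $link->appendChild($word);
        // Keep searching in the text that follows the new link.
        $node = $after;
    }
}
$html = $dom->saveHTML();
echo $html;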
It's verbose, but it will actually work.
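Since a forum plugin usually gets an HTML string rather than a file, here is a rough sketch of the same idea wrapped in a function. The name linkKeywordDom, the wrapper <div>, and the case-insensitive stripos() search are my own assumptions, not something from the code above. It also assumes ASCII keywords and content: stripos() returns byte offsets while splitText() counts characters, so multi-byte text would need extra care, and loadHTML() needs an encoding hint for non-ASCII input.

```php
<?php
// Hypothetical helper: link every occurrence of $keyword in an HTML
// fragment, skipping text that is already inside an <a> element.
function linkKeywordDom(string $html, string $keyword, string $url): string
{
    if ($keyword === '') {
        return $html; // nothing to search for
    }

    $dom = new DOMDocument;
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> gives us a known element to read the result from.
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//text()[not(ancestor::a)]') as $node) {
        // Case-insensitive search; assumes ASCII (see the note above).
        while (false !== ($pos = stripos($node->nodeValue, $keyword))) {
            $word = $node->splitText($pos);
            $after = $word->splitText(strlen($keyword));
            $link = $dom->createElement('a');
            $link->setAttribute('href', $url);
            $word->parentNode->replaceChild($link, $word);
            $link->appendChild($word);
            $node = $after;
        }
    }

    // Serialize only the children of the wrapper, not the html/body
    // elements that loadHTML() adds around the fragment.
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo linkKeywordDom('I use Amazon for that.', 'Amazon', 'http://www.amazon.com'), "\n";

// Nested tags inside an existing link are not a problem here, because
// the ancestor::a test looks at the whole chain of parent elements.
echo linkKeywordDom('<a href="http://www.amazon.com"><b>Amazon</b> store</a>', 'amazon', 'http://www.amazon.com'), "\n";
```

For several keywords you could call it once per keyword; links created by an earlier call are skipped by later ones because their text now has an <a> ancestor.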