I have found a solution for auto-detecting links and wrapping them in an <a>
tag here: Regex PHP - Auto-detect YouTube, image and "regular" links
Relevant part (I had to move the function outside of the preg_replace_callback
call for compatibility reasons):
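function put_url_in_a($arr)
{
    // Prepend a scheme only when the match has none (covers http:// and https://)
    if(!preg_match('#^https?://#i', $arr[0]))
    {
        $arr[0] = 'http://' . $arr[0];
    }
    // parsed URL parts (not used in this trimmed-down version)
    $url = parse_url($arr[0]);
    //links
    return sprintf('<a href="%1$s">%1$s</a>', $arr[0]);
}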
$s = preg_replace_callback('#(?:https?://\S+)|(?:www.\S+)|(?:\S+\.\S+)#', 'put_url_in_a', $s);
This works fine, except when it stumbles upon a URL that is already inside an <a> tag, which it then ruins (by putting another <a> tag into it). It also ruins embedded media.
Question: How can I exclude HTML tags from being processed by this function using, hopefully, only regular expressions?
One option: if the URL is already in a link, it must be preceded by href=' or href=", so exclude those matches with a negative lookbehind assertion:
#(?<!href\=['"])(?:https?://\S+)|(?:www.\S+)|(?:\S+\.\S+)#
EDIT: Actually, the above form won't work because the URL match is too general; it will incorrectly turn things like ... into a link. My own favourite URL-matching scheme seems to work correctly:
$s = preg_replace_callback('#(?<!href\=[\'"])(https?|ftp|file)://[-A-Za-z0-9+&@\#/%()?=~_|$!:,.;]*[-A-Za-z0-9+&@\#/%()=~_|$]#', 'regexp_url_search', $s);
For example: http://codepad.viper-7.com/TukPdY
$s = "The following link should be linkified: http://www.google.com but not this one: <a href='http://www.google.com'>google</a>.";
Becomes:
The following link should be linkified: <a href="http://www.google.com">http://www.google.com</a> but not this one: <a href='http://www.google.com'>google</a>.
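To tie the pieces together, here is a minimal, self-contained sketch of the lookbehind approach. The answer only names the callback regexp_url_search without showing it, so its body below is an assumption, as are the linkify() wrapper and the htmlspecialchars() escaping.

```php
<?php
// Hypothetical callback: the answer references 'regexp_url_search' by name only,
// so this body is an assumed implementation.
function regexp_url_search(array $m): string
{
    // $m[0] is the full matched URL; escape it before placing it into HTML.
    $url = htmlspecialchars($m[0], ENT_QUOTES, 'UTF-8');
    return sprintf('<a href="%1$s">%1$s</a>', $url);
}

// Hypothetical wrapper around the preg_replace_callback call from the answer.
function linkify(string $s): string
{
    // The negative lookbehind skips any URL directly preceded by href=' or href="
    $pattern = '#(?<!href=[\'"])(?:https?|ftp|file)://'
             . '[-A-Za-z0-9+&@\#/%()?=~_|$!:,.;]*'
             . '[-A-Za-z0-9+&@\#/%()=~_|$]#';
    return preg_replace_callback($pattern, 'regexp_url_search', $s);
}

$s = "The following link should be linkified: http://www.google.com but not this one: <a href='http://www.google.com'>google</a>.";
echo linkify($s), "\n";
// The bare URL is wrapped in an <a> tag; the one inside href='...' is left untouched.
```

Keep in mind that the lookbehind only skips URLs directly preceded by href=' or href=". A URL in a src attribute, or one used as the visible text of an existing link, would still be matched, so embedded media probably needs additional handling.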