
Wrap links in <a> tag with regular expression

Tags: regex, php

I need to wrap all links in a text in <a> tags using a regular expression in PHP, except those that are already wrapped.

So I have this text:

Some text with html here
http://www.somelink.html
http://www.somelink.com/view/?id=95
<a href="http://anotherlink.html">http://anotherlink.html</a>
<a href="http://anotherlink.html">Title</a>

What I need to get:

Some text with html here
<a href="http://www.somelink.html">http://www.somelink.html</a>
<a href="http://www.somelink.com/view/?id=95">http://www.somelink.com/view/?id=95</a>
<a href="http://anotherlink.html">http://anotherlink.html</a>
<a href="http://anotherlink.html">Title</a>

I can match links with this expression:

(?:(?:https?|ftp):\/\/|www.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]

but it also matches those that are already inside <a> tags.

Grom S asked Mar 04 '11 15:03


2 Answers

You would use a negative lookbehind. The syntax is:

(?<!text)

So in your case, it would be:

(?<!\<a)

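Or something close to the above. Keep in mind that a lookbehind only tests the characters immediately before the match, so in practice it has to be anchored on what directly precedes the URL (for example href=" or ">) rather than on <a itself.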
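
To make the lookbehind idea concrete, here is a minimal sketch rather than a definitive implementation. The function name wrap_bare_urls is hypothetical, and the three lookbehinds are assumptions about how the existing links are written: the URL follows href=" directly, the link text follows "> directly, and a match must not start right after a slash or word character (otherwise the www. part of an already linked URL would match on its own). Lookbehinds in PCRE have traditionally had to be fixed-length, which is why each condition gets its own short assertion.

```php
<?php
// Hypothetical helper (not from the question): wraps bare URLs in <a> tags
// and skips URLs that look like they already belong to an <a> element.
function wrap_bare_urls(string $text): string
{
    // Assumed lookbehinds:
    //   (?<!href=")   URL is not an href attribute value
    //   (?<!">)       URL is not the text right after an opening tag
    //   (?<![\/\w])   do not start mid-URL, e.g. at "www." after "http://"
    // The rest is the question's pattern with the dot after "www" escaped.
    $pattern = '/(?<!href=")(?<!">)(?<![\/\w])'
             . '(?:(?:https?|ftp):\/\/|www\.)'
             . '[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';

    $result = preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        // Assumption: a bare "www." link should point at http://.
        $href = stripos($url, 'www.') === 0 ? 'http://' . $url : $url;
        // The pattern cannot match ", < or >, so the URL is inserted as is.
        return '<a href="' . $href . '">' . $url . '</a>';
    }, $text);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}
```

This is a heuristic: a link written with single quotes, with extra attributes after href, or with whitespace before the link text would slip past these lookbehinds.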

Anthony answered Oct 12 '22 02:10


For reliability, I would split on <a> tags (inclusive of child content) plus other tags (exclusive of child content) like:

// -1 means "no limit"; passing null for the limit is deprecated as of PHP 8.1
$bits = preg_split('/(<a(?:\s+[^>]*)?>.*?<\/a>|<[a-z][^>]*>)/is', $content, -1, PREG_SPLIT_DELIM_CAPTURE);

$reconstructed = '';

foreach ($bits as $bit) {
  if (strpos($bit, '<') !== 0) { // not an <a>...</a> element or another tag, so check for URLs
    $bit = link_urls($bit);
  }
  $reconstructed .= $bit;
}
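
The loop above calls link_urls() without defining it. As a rough sketch of what such a helper might look like, the version below reuses the URL pattern from the question (with the dot after www escaped). The function body, and the choice to prefix http:// to bare www. links, are assumptions for illustration and not part of the original answer.

```php
<?php
// Hypothetical link_urls(): the answer calls it but does not define it.
function link_urls(string $text): string
{
    // URL pattern from the question, with the dot after "www" escaped.
    $pattern = '/(?:(?:https?|ftp):\/\/|www\.)'
             . '[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';

    $result = preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        // Assumption: a bare "www." link should point at http://.
        $href = stripos($url, 'www.') === 0 ? 'http://' . $url : $url;
        // The pattern cannot match ", < or >, so the URL is inserted as is.
        return '<a href="' . $href . '">' . $url . '</a>';
    }, $text);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}
```

Because the split step has already set aside existing <a> elements and other tags, this helper needs no lookbehind and only ever sees plain text between tags.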

Walf answered Oct 12 '22 01:10