
PHP URL to Link with Regex

I know I've seen this done in a lot of places, but I need something a little different from the norm. Sadly, when I search for this, it gets buried under posts about simply turning the URL into an HTML link. I want a PHP function that strips the "http://" or "https://" from the link text, as well as anything after the domain (the .com, .org, and so on), so basically I am looking to turn A into B.

A: http://www.youtube.com/watch?v=spsnQWtsUFM
B: <a href="http://www.youtube.com/watch?v=spsnQWtsUFM">www.youtube.com</a>

If it helps, here is my current PHP regex replace function.

ereg_replace("[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]", "<a href=\"\\0\" class=\"bwl\" target=\"_new\">\\0</a>", htmlspecialchars($body, ENT_QUOTES));

It would probably also help to say that I have absolutely no understanding of regular expressions. Thanks!

EDIT: When I enter a comment like this: blahblah https://www.facebook.com/?sk=ff&ap=1 blah, I get HTML like this: <a class="bwl" href="blahblah https://www.facebook.com/?sk=ff&amp;ap=1 blah">www.facebook.com</a>, which doesn't work at all because it takes the text around the link with it. It works great if someone comments only a link, however. This happens after I changed the function to this:

preg_replace("#^(.*)//(.*)/(.*)$#",'<a class="bwl" href="\0">\2</a>',  htmlspecialchars($body, ENT_QUOTES));
Brian Leishman asked Jun 18 '11 04:06


2 Answers

This is the simplest and cleanest way:

$str = 'http://www.youtube.com/watch?v=spsnQWtsUFM';
preg_match("#//(.+?)/#", $str, $matches);

$site_url = $matches[1];

EDIT: I assume that $str has already been checked to be a URL, so I left that check out. I also assume that every URL contains either 'http://' or 'https://'. If the URL is formatted like www.youtube.com/watch?v=spsnQWtsUFM or even youtube.com/watch?v=spsnQWtsUFM, the above regexp won't work!
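As an alternative to the regexp for a single, already-validated URL, PHP's built-in parse_url() can return the host directly. The sketch below is not part of the original answer: host_of() is a hypothetical helper name, and treating scheme-less input as http is an assumption made only so that the two formats mentioned above also work. The return type syntax assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the answer): returns the host part of a URL,
// or null when no host can be determined.
function host_of(string $url): ?string
{
    // Assumption: input without a scheme, such as "youtube.com/watch",
    // is meant as an http URL, so a scheme is prepended before parsing.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }

    // parse_url() with PHP_URL_HOST yields the host string, or null/false on failure.
    $host = parse_url($url, PHP_URL_HOST);

    return is_string($host) ? $host : null;
}

echo host_of('http://www.youtube.com/watch?v=spsnQWtsUFM'), "\n"; // www.youtube.com
echo host_of('youtube.com/watch?v=spsnQWtsUFM'), "\n";            // youtube.com
```

This only covers the case of one URL per string; it does not find URLs inside longer text.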

EDIT2: I'm sorry, I didn't realize that you were trying to replace all URLs in a whole text. In that case, this should work the way you want it:

$str = preg_replace('#(\A|[^=\]\'"a-zA-Z0-9])(http[s]?://(.+?)/[^()<>\s]+)#i', '\\1<a href="\\2">\\3</a>', $str);
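The same idea can also be written with preg_replace_callback(), which makes it easier to see which part becomes the href and which part becomes the label. This is only a sketch under a few assumptions, not the answer's own code: linkify_hosts() is a hypothetical name, the bwl class and the htmlspecialchars() call are taken from the question, only http and https URLs are handled, and the path is optional so that a bare http://example.com is linked too. Trailing punctuation directly after a URL (for example a full stop) is not stripped.

```php
<?php
// Hypothetical helper: wraps every http/https URL found in $text in a link
// whose visible label is the host name only.
function linkify_hosts(string $text): string
{
    // Escape first, as the question does, so user-supplied markup is neutralised.
    $escaped = htmlspecialchars($text, ENT_QUOTES);

    // Group 1 is the host. The optional port and the optional path/query/fragment
    // are matched but only end up in the href (the whole match, $m[0]).
    $pattern = '~\bhttps?://([a-z0-9.-]+)(?::\d+)?(?:[/?#][^\s<>]*)?~i';

    $result = preg_replace_callback($pattern, function (array $m): string {
        return '<a class="bwl" href="' . $m[0] . '">' . $m[1] . '</a>';
    }, $escaped);

    // preg_replace_callback() returns null on a regex error; fall back to the escaped text.
    return $result ?? $escaped;
}

echo linkify_hosts('blahblah https://www.facebook.com/?sk=ff&ap=1 blah'), "\n";
// blahblah <a class="bwl" href="https://www.facebook.com/?sk=ff&amp;ap=1">www.facebook.com</a> blah
```

Because the pattern is not anchored with ^ and $, every URL in the text is replaced, and the surrounding words stay outside the link.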
Battle_707 answered Nov 08 '22 04:11


I am not a regex whiz either, but this:

^(.*)//(.*)/(.*)$
<a href="\1//\2/\3">\2</a>

was what worked for me when I tried it as a find-and-replace in Programmer's Notepad.

^(.*)// should extract the protocol, referred to as \1 in the second line. (.*)/ should extract everything up to the following / (the host in the example URL; because .* is greedy, a URL with a longer path would match up to its last /), referred to as \2 in the second line. (.*)$ captures everything up to the end of the string, referred to as \3 in the second line.
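To see how the capture groups behave in PHP, here is a small check that runs the pattern with preg_match() against a URL whose path contains more than one slash. The URL is a made-up example, and the second pattern with lazy quantifiers (.*?) is an added variation, not something from the answer.

```php
<?php
// Hypothetical URL with several "/" in the path.
$url = 'http://www.youtube.com/user/example/videos';

// Greedy: (.*) grabs as much as possible, so group 2 runs up to the LAST "/".
preg_match('#^(.*)//(.*)/(.*)$#', $url, $greedy);

// Lazy: (.*?) grabs as little as possible, so group 2 stops at the FIRST "/".
preg_match('#^(.*?)//(.*?)/(.*)$#', $url, $lazy);

echo $greedy[2], "\n"; // www.youtube.com/user/example
echo $lazy[2], "\n";   // www.youtube.com
```

For the URL in the question both patterns give the same result, since its path contains only one slash.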


Added later

^(.*)( )(.*)//(.*)/(.*)( )(.*)$
\1\2<a href="\3//\4/\5">\4</a> \7

This should be a bit better, but it will replace only one URL, and the pattern expects a space both before and after that URL.

Lord Loh. answered Nov 08 '22 04:11