
PHP: Regular Expression to get a URL from a string [duplicate]

Tags: regex, url, php

Possible Duplicates:
Identifying if a URL is present in a string
Php parse links/emails

I'm working on some PHP code which takes input from various sources and needs to find the URLs and save them somewhere. The kind of input that needs to be handled is as follows:

http://www.youtube.com/watch?v=IY2j_GPIqRA
Try google: http://google.com! (note exclamation mark is not part of the URL)
Is http://somesite.com/ down for anyone else?

Output:

http://www.youtube.com/watch?v=IY2j_GPIqRA
http://google.com
http://somesite.com/

I've already borrowed one regular expression from the internet which works, but unfortunately wipes the query string out - not good!

Any help putting together a regular expression, or perhaps another solution to this problem, would be appreciated.

Matthew Iselin asked Dec 12 '22 22:12


1 Answer

Jan Goyvaerts, Regex Guru, has addressed this issue in his blog. There are quite a few caveats, for example extracting URLs inside parentheses correctly. What you need exactly depends on the "quality" of your input data.

For the examples you provided, \b(?:(?:https?|ftp|file)://|www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$] works when used in case-insensitive mode.

So to find all matches in a multiline string, use

preg_match_all('/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i', $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];
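
As a quick check, here is a minimal sketch that runs the pattern above against the three sample inputs from the question. The constant `URL_PATTERN`, the helper `extractUrls()`, and the Wikipedia test string are illustrative names and data, not part of the original answer. The last call shows one of the caveats mentioned earlier: parentheses are not in the character class, so a URL that contains them is cut short.

```php
<?php
// Pattern copied from the answer; URL_PATTERN is a hypothetical constant name.
const URL_PATTERN = '/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i';

// Hypothetical helper: returns every URL-like match found in $subject.
function extractUrls(string $subject): array
{
    // preg_match_all() returns false on error; treat that as "no matches" here.
    if (preg_match_all(URL_PATTERN, $subject, $matches) === false) {
        return [];
    }
    return $matches[0]; // full-pattern matches, in order of appearance
}

// Sample input from the question (nowdoc, so nothing is interpolated).
$input = <<<'TEXT'
http://www.youtube.com/watch?v=IY2j_GPIqRA
Try google: http://google.com! (note exclamation mark is not part of the URL)
Is http://somesite.com/ down for anyone else?
TEXT;

print_r(extractUrls($input));

// Caveat: "(" is not in the character class, so the match stops at "PHP_".
print_r(extractUrls('See http://en.wikipedia.org/wiki/PHP_(disambiguation) for more.'));
```

The first call should yield the three URLs listed under "Output" in the question, with the query string intact and the trailing exclamation mark dropped. The second yields only `http://en.wikipedia.org/wiki/PHP_`, so input that contains such URLs needs a more permissive pattern.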
Tim Pietzcker answered Mar 03 '23 19:03