I'm not very good at regular expressions at all.
I've been using a lot of framework code to date, but I'm unable to find a pattern that matches a URL like http://www.example.com/etcetc but also catches something like www.example.com/etcetc and example.com/etcetc.
@:%_\+~#= , to match the domain/subdomain name. This solution also takes care of query string parameters. If you are not using RegEx, replace \\ with \ in the expression. Hope this helps.
[] denotes a character class. () denotes a capturing group. [a-z0-9] -- One character that is in the range of a-z OR 0-9.
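To make the difference concrete, here is a minimal PHP sketch. The subject string and variable names are made up for illustration; it uses a character class on its own and then inside two capturing groups.

```php
<?php
// Illustrative subject string (an assumption, not from the original post).
$subject = 'abc123-xyz';

// [a-z0-9] is a character class: it matches exactly ONE character.
preg_match('/[a-z0-9]/', $subject, $single);

// (...) is a capturing group: the text it matched is stored in $m[1], $m[2], ...
preg_match('/([a-z0-9]+)-([a-z]+)/', $subject, $m);

var_dump($single[0]);   // the single character matched by the class
var_dump($m[1], $m[2]); // the text captured by each group
```

The class only decides which characters are allowed at one position; the group decides which part of the match you get back.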
Literal Characters and Sequences
For instance, you might need to search for a dollar sign ("$") as part of a price list, or in a computer program as part of a variable name. Since the dollar sign is a metacharacter which means "end of line" in regex, you must escape it with a backslash to use it literally.
Most characters, including all letters ( a-z and A-Z ) and digits ( 0-9 ), match themselves. For example, the regex x matches substring "x" ; z matches "z" ; and 9 matches "9" . Non-alphanumeric characters without special meaning in regex also match themselves. For example, = matches "=" ; @ matches "@" .
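A short PHP sketch of both points, literal characters and the escaped dollar sign. The sample string is an assumption chosen for illustration; `preg_quote()` is PHP's built-in helper for escaping metacharacters.

```php
<?php
// Illustrative input (an assumption, not from the original post).
$line = 'price=$42 @home';

// Characters without special meaning match themselves.
$hasEquals = preg_match('/=/', $line);
$hasAt     = preg_match('/@/', $line);

// An unescaped $ is an end-of-subject anchor, so this pattern asks for a "4"
// AFTER the end of the string and can never match.
$unescaped = preg_match('/$4/', $line);

// Escaped with a backslash, the $ is treated as a literal character.
$escaped = preg_match('/\$4/', $line);

// preg_quote() adds the backslashes for you when the text comes from elsewhere.
$quoted = preg_quote('$42', '/');

var_dump($hasEquals, $hasAt, $unescaped, $escaped, $quoted);
```

The patterns are written in single quotes so that PHP itself does not try to interpret the `$` before the regex engine sees it.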
For matching all kinds of URLs, the following code should work:
<?php
// Single-quoted strings are used so that PHP does not try to
// interpolate "$_" as a variable inside the pattern.
$regex  = '((https?|ftp)://)?'; // Scheme
$regex .= '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?'; // User and password
$regex .= '([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))'; // Host or IP
$regex .= '(:[0-9]{2,5})?'; // Port
$regex .= '(/([a-z0-9+$_%-]\.?)+)*/?'; // Path
$regex .= '(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?'; // GET query
$regex .= '(#[a-z_.-][a-z0-9+$%_.-]*)?'; // Anchor
?>
Then, the correct way to check against the regex is as follows:
<?php
if (preg_match("~^$regex$~i", 'www.example.com/etcetc', $m)) {
    var_dump($m);
}
if (preg_match("~^$regex$~i", 'http://www.example.com/etcetc', $m)) {
    var_dump($m);
}
?>
Courtesy: Comments made by splattermania in the PHP manual: http://php.net/manual/en/function.preg-match.php
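If you only need a yes/no answer, the check can be wrapped in a small helper. This is a sketch, not part of the original manual comment: the function name `looksLikeUrl` is hypothetical, and the pattern pieces are written in single quotes so that PHP does not try to interpolate `$_`.

```php
<?php
// Hypothetical helper (name and signature are assumptions for illustration).
function looksLikeUrl(string $candidate): bool
{
    $regex  = '((https?|ftp)://)?';                                      // Scheme
    $regex .= '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?'; // User and password
    $regex .= '([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))'; // Host or IP
    $regex .= '(:[0-9]{2,5})?';                                          // Port
    $regex .= '(/([a-z0-9+$_%-]\.?)+)*/?';                               // Path
    $regex .= '(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?';                   // GET query
    $regex .= '(#[a-z_.-][a-z0-9+$%_.-]*)?';                             // Anchor

    // ^ and $ force the whole string to be a URL; "i" makes it case-insensitive.
    return preg_match('~^' . $regex . '$~i', $candidate) === 1;
}

// The three forms from the question, plus one string that should be rejected.
$candidates = [
    'http://www.example.com/etcetc',
    'www.example.com/etcetc',
    'example.com/etcetc',
    'not a url',
];
foreach ($candidates as $candidate) {
    echo $candidate, ' => ', looksLikeUrl($candidate) ? 'match' : 'no match', PHP_EOL;
}
```

Using `~` as the delimiter avoids having to escape every `/` in the pattern.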
This worked for me in all cases I had tested:
$url_pattern = '/((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z0-9\&\.\/\?\:@\-_=#])*/';
Tests:
http://test.test-75.1474.stackoverflow.com/
https://www.stackoverflow.com
https://www.stackoverflow.com/
http://wwww.stackoverflow.com/
http://wwww.stackoverflow.com
http://test.test-75.1474.stackoverflow.com/
http://www.stackoverflow.com
http://www.stackoverflow.com/
stackoverflow.com/
stackoverflow.com
http://www.example.com/etcetc
www.example.com/etcetc
example.com/etcetc
user:[email protected]/etcetc
example.com/etcetc?query=aasd
example.com/etcetc?query=aasd&dest=asds
http://stackoverflow.com/questions/6427530/regular-expression-pattern-to-match-url-with-or-without-http-www
http://stackoverflow.com/questions/6427530/regular-expression-pattern-to-match-url-with-or-without-http-www/
Every valid Internet URL has at least one dot, so the above pattern simply looks for at least two strings joined by a dot, each made up of characters that a URL may contain.
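As a rough check of that reasoning, the sketch below runs the pattern over a few inputs. The sample list and the `$results` array are assumptions added for illustration; the last two samples are not from the test list above.

```php
<?php
// Pattern copied unchanged from the answer.
$url_pattern = '/((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z0-9\&\.\/\?\:@\-_=#])*/';

$samples = [
    'http://www.example.com/etcetc',
    'www.example.com/etcetc',
    'example.com/etcetc?query=aasd&dest=asds',
    'localhost',               // no dot, so nothing to match
    'visit example.com today', // URL embedded in other text
];

// Store the matched text for each sample, or null when there is no match.
$results = [];
foreach ($samples as $sample) {
    $results[$sample] = preg_match($url_pattern, $sample, $m) === 1 ? $m[0] : null;
}

var_dump($results);
```

Because the pattern has no ^ and $ anchors, it finds a URL-like substring anywhere in the input. If you need to validate that the whole string is a URL, anchor the pattern at both ends.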