Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

URL Validation - Accepts URLs without protocols

I've a basic URL validation in my appliction. Right now i'm using the following code.

//validates whether the given value is 
//a valid URL
function validateUrl(value)
{
    var regexp = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/
    return regexp.test(value);
}

But right now it is not accepting URLs without the protocol. For ex. if i provide www.google.com it is not accepting it. How can i modify the RegEx to make it accept URLs without protocol?

like image 360
NLV Avatar asked Aug 03 '10 12:08

NLV


2 Answers

Here's a big long regex for matching a URL:

(?i)\b((?:(?:[a-z][\w-]+:)?(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))

The expanded version of that (to help make it understandable):

(?xi)
\b
(                           # Capture 1: entire matched URL
  (?:
    (?:[a-z][\w-]+:)?                # URL protocol and colon
    (?:
      /{1,3}                        # 1-3 slashes
      |                             #   or
      [a-z0-9%]                     # Single letter or digit or '%'
                                    # (Trying not to match e.g. "URI::Escape")
    )
    |                           #   or
    www\d{0,3}[.]               # "www.", "www1.", "www2." … "www999."
    |                           #   or
    [a-z0-9.\-]+[.][a-z]{2,4}/  # looks like domain name followed by a slash
  )
  (?:                           # One or more:
    [^\s()<>]+                      # Run of non-space, non-()<>
    |                               #   or
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
  )+
  (?:                           # End with:
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
    |                                   #   or
    [^\s`!()\[\]{};:'".,<>?«»“”‘’]        # not a space or one of these punct chars
  )
)

These both come from this page, but modified slightly to make protocol properly optional - you should read that page to help understand what it's doing, and it also has a variant which only matched web-based URLs, which you may want to take a look at too.

like image 184
Peter Boughton Avatar answered Sep 29 '22 20:09

Peter Boughton


Change the regex to:

/((ftp|http|https):\/\/)?(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/
like image 23
tdammers Avatar answered Sep 29 '22 20:09

tdammers