How to check if a URL is valid

Tags:

ruby

How can I check if a string is a valid URL?

For example:

http://hello.it => yes
http:||bra.ziz, => no

If it is a valid URL, how can I check whether it points to an image file?

Luca Romagnoli asked Nov 26 '09 21:11


2 Answers

Notice:

As @CGuess pointed out, there is a bug report on this, and it has been documented for over 9 years that validation is not the purpose of this regular expression (see https://bugs.ruby-lang.org/issues/6520).




Use the URI module distributed with Ruby:

require 'uri'

if url =~ URI::regexp
  # Correct URL
end

As Alexander Günther said in the comments, this only checks whether a string contains a URL.

To check whether the whole string is a URL, use:

url =~ /\A#{URI::regexp}\z/ 

If you only want to check for web URLs (http or https), use this:

url =~ /\A#{URI::regexp(['http', 'https'])}\z/ 
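The difference between "contains a URL" and "is a URL" can be sketched as follows. This is a minimal example, not a definitive validator: the module and method names (`UrlCheck`, `contains?`, `whole?`) are hypothetical, and the regexp is built from `URI::RFC2396_Parser` directly on the assumption that it produces the same pattern `URI::regexp` has traditionally returned.

```ruby
require 'uri'

# Hypothetical helper module, for illustration only.
module UrlCheck
  # http/https pattern built from the RFC 2396 parser (assumed to be
  # equivalent to URI::regexp(['http', 'https'])).
  HTTP_URL = URI::RFC2396_Parser.new.make_regexp(%w[http https])

  # The same pattern anchored to the start and end of the string.
  WHOLE_HTTP_URL = /\A#{HTTP_URL}\z/

  # True if an http(s) URL appears anywhere in the string.
  def self.contains?(str)
    str.match?(HTTP_URL)
  end

  # True only if the entire string is an http(s) URL.
  def self.whole?(str)
    str.match?(WHOLE_HTTP_URL)
  end
end

puts UrlCheck.contains?('see http://hello.it now') # => true
puts UrlCheck.whole?('see http://hello.it now')    # => false
puts UrlCheck.whole?('http://hello.it')            # => true
puts UrlCheck.whole?('http:||bra.ziz,')            # => false
```

Keep in mind the notice at the top of this answer: the pattern is meant for extracting URIs, so even the anchored form is a loose syntactic check rather than strict validation.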
Mikael S answered Oct 09 '22 11:10


Similar to the answers above, I find using this regex to be slightly more accurate:

URI::DEFAULT_PARSER.regexp[:ABS_URI] 

That rejects URLs containing spaces, unlike URI.regexp, which allows spaces for some reason.

I recently found a shortcut for the different URI regexps: any key in URI::DEFAULT_PARSER.regexp.keys can be accessed directly as URI::#{key}.

For example, the :ABS_URI regexp can be accessed from URI::ABS_URI.
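A short sketch of the `:ABS_URI` check is below. It instantiates `URI::RFC2396_Parser` explicitly instead of going through `URI::DEFAULT_PARSER`; that is an assumption made for portability, since the default parser may differ between Ruby versions and may not expose the same keys. The variable names are illustrative.

```ruby
require 'uri'

# Explicit RFC 2396 parser (assumption: this is the parser whose regexp
# table contains :ABS_URI, as described in the answer above).
parser = URI::RFC2396_Parser.new
abs_uri = parser.regexp[:ABS_URI]

puts parser.regexp.key?(:ABS_URI)            # => true
puts 'http://hello.it'.match?(abs_uri)       # => true
puts 'http://hello world.it'.match?(abs_uri) # => false
```

Note that this pattern is not restricted to http or https, so other schemes can match as well.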

jonuts answered Oct 09 '22 13:10