- (BOOL)validateUrl:(NSString *)candidate {
    // Matches input that starts with an explicit http:// or https:// scheme.
    NSString *urlRegEx =
        @"(http|https)://((\\w)*|([0-9]*)|([-|_])*)+([\\.|/]((\\w)*|([0-9]*)|([-|_])*))+";
    NSPredicate *urlTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", urlRegEx];
    // Fallback for input typed without a scheme: look for a known domain suffix anywhere in the string.
    if ([urlTest evaluateWithObject:candidate]
        || [candidate containsString:@".com"]
        || [candidate containsString:@".net"]
        || [candidate containsString:@".org"]
        || [candidate containsString:@".cn"]
        || [candidate containsString:@".jp"]) {
        return YES;
    }
    return NO;
}
This approach needs a long list of domain suffixes (".com", ".net", ".org", and so on), because people don't type "http" at the front in the address bar.
So how does Chrome's address bar decide whether the input is a URL or a search string?
If I input "a.fa", it's not a URL.
"a a.com" is a search string.
"a.mobi/aaa" is a URL.
It would be possible to find the answer in the Chromium source, as funroll mentioned, but here's the basic idea of what's going on, at least according to my testing.
A string entered into the omnibox is treated as a URL if it follows this format:
[protocol][subdomains].[subdomains].[domain name].[tld]
Subdomains (which are optional, of course) and the domain name may contain only letters (for Chrome, this seems to include accented letters), numbers, spaces, and hyphens. The TLD (top-level domain) must come from an approved list (.com, .net, etc.) unless a protocol is specified, in which case any TLD is treated as valid. Protocols also come from a set list, but can be written in pretty much any form: a colon followed by any number of slashes, including none. If the protocol is not on that list, the entire input is treated as a search instead.
If there is a slash after a string in the above URL format (e.g., stackoverflow.com/), then anything afterwards works.
Alternatively, if a slash occurs at the start of the string, Chrome treats it as a URL as well (with the file:// protocol).
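As a rough illustration, the rules above could be sketched in Objective-C like this. It is only a sketch based on the observed behaviour, not Chromium's actual logic. The function name LooksLikeURL and the short scheme and TLD lists are made up for the example (the real lists are much longer). It also rejects spaces in the host, which is an assumption drawn from how "a a.com" and "stack overflow.com" behave.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if `input` looks like a URL under the rules
// described above, NO if it should be handed to the search engine.
static BOOL LooksLikeURL(NSString *input) {
    // Assumed, deliberately short lists for illustration only.
    NSSet<NSString *> *knownSchemes = [NSSet setWithObjects:@"http", @"https", @"ftp", nil];
    NSSet<NSString *> *knownTLDs =
        [NSSet setWithObjects:@"com", @"net", @"org", @"cn", @"jp", @"mobi", nil];

    if (input.length == 0) return NO;
    // A leading slash is treated as a local file path.
    if ([input hasPrefix:@"/"]) return YES;

    NSString *rest = input;
    BOOL hasScheme = NO;

    // Optional scheme: ASCII letters, a colon, then any number of slashes.
    NSRegularExpression *schemeRegex =
        [NSRegularExpression regularExpressionWithPattern:@"^([A-Za-z]+):/*"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [schemeRegex firstMatchInString:input
                                options:0
                                  range:NSMakeRange(0, input.length)];
    if (match != nil) {
        NSString *scheme =
            [[input substringWithRange:[match rangeAtIndex:1]] lowercaseString];
        // An unknown scheme turns the whole input into a search.
        if (![knownSchemes containsObject:scheme]) return NO;
        hasScheme = YES;
        rest = [input substringFromIndex:NSMaxRange(match.range)];
    }

    // The host is everything before the first slash; anything after it is accepted.
    NSRange slash = [rest rangeOfString:@"/"];
    NSString *host =
        (slash.location == NSNotFound) ? rest : [rest substringToIndex:slash.location];

    // Need at least "name.tld".
    NSArray<NSString *> *labels = [host componentsSeparatedByString:@"."];
    if (labels.count < 2) return NO;

    // Each label: Unicode letters and digits (so accented letters pass) plus hyphens.
    NSMutableCharacterSet *allowed = [NSMutableCharacterSet alphanumericCharacterSet];
    [allowed addCharactersInString:@"-"];
    NSCharacterSet *disallowed = [allowed invertedSet];
    for (NSString *label in labels) {
        if (label.length == 0) return NO;
        if ([label rangeOfCharacterFromSet:disallowed].location != NSNotFound) return NO;
    }

    // With a known scheme any TLD is accepted; otherwise it must be on the list.
    if (hasScheme) return YES;
    return [knownTLDs containsObject:[labels.lastObject lowercaseString]];
}
```

With these assumed lists, LooksLikeURL(@"a.mobi/aaa") returns YES, while LooksLikeURL(@"a.fa") and LooksLikeURL(@"a a.com") both return NO. Ports, IP addresses, and file:// URLs are not handled.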
Examples of valid URLs (according to Chrome):
- stackoverflow.com
- abc.stackoverflow.com
- abc.abc.abc.abc.stackoverflow.com
- stáckoverflow.com (this changes the URL, but is allowed—try it!)
- stack-overflow.com
- -stackoverflow.com (might not even be a legal domain name, but it works)
- 4stackoverflow.com
- stackoverflow.com
- stackoverflow.com/not valid characters !@#$^æ
- [http]://stackoverflow.com (the brackets aren't legal, but I can't include the link otherwise)
- [http]:////stackoverflow.com
- [http]:stackoverflow.com
- [http]:stackoverflow.mynewtld
Examples of invalid URLs:
- stack overflow.com
- stackoverflow*.com
- stack/overflow.com
- stackoverflow.mynewtld
And, well, just about everything else.
Let's just hope there's a library out there somewhere to do all this instead.