How does Facebook's URL matching algorithm work? [duplicate]

Question

You know how if you go to facebook.com and enter a URL into the status update textarea it will automatically be detected, and Facebook will display a little snapshot of data from that URL/link? Facebook doesn't even care if you enter a URL with or without a protocol like http://.

I'm looking to replicate this behavior. Right now I have this regular expression:

((?:https?://)?)((?:[a-zA-Z0-9\-]+\.)+(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2})(?:[a-z0-9\._/~%\-\+&\#\?!=@]*)?(?:#?(?:[w]+)?)?)

And I use it to match URLs entered in a textarea. However, it has false positives; it'll match document.write(foo) which clearly isn't a URL.

Facebook doesn't seem to have this issue. In fact, I can type "yahoo.com " into Facebook's textarea and it'll recognize it as a URL. But if I type "example.com " it won't recognize it. So this means Facebook must be doing something more than just regular expression matching. Or am I wrong about this?

In conclusion, I want to know what Facebook is doing, and I want to know how I can replicate it. Any ideas, tips or solutions are very much appreciated.

Thanks for reading.

Sedecimdies · Accepted Answer

The simplest regex to match any URL is

[a-z_\.\-0-9]+\.[a-z]+

If this matches, do a lookup on the result. If the lookup fails, then it wasn't a URL.

There is no safe way to tell whether something is a URL if it's presented to you without the http:// prefix.

The regex will match stackoverflow.com in the following string:

I always use stackoverflow.com to find the answers i need.
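As a rough illustration, here is that pattern applied in JavaScript. The `i` and `g` flags and the helper name `findCandidates` are my own additions for the sketch, not part of the original pattern. Note that the pattern also picks up `document.write`, which is why a match can only be treated as a candidate:

```javascript
// The answer's pattern; the "g" and "i" flags are an assumption added here
// so that every candidate is found regardless of letter case.
const CANDIDATE = /[a-z_.\-0-9]+\.[a-z]+/gi;

// Hypothetical helper: return every substring that looks like a host name.
function findCandidates(text) {
  return text.match(CANDIDATE) || [];
}

console.log(findCandidates("I always use stackoverflow.com to find the answers i need."));
// ["stackoverflow.com"]

console.log(findCandidates("document.write(foo)"));
// ["document.write"]  (a false positive that only a lookup can rule out)
```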

If you try "http://www." & regex.match.value you should get a valid URL... or not. You won't know until you do a lookup.
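A minimal sketch of the match-then-lookup idea for Node.js follows. It makes several assumptions: "lookup" is interpreted as a DNS lookup via `dns.promises.lookup`, the names `detectUrls` and `dnsResolves` are hypothetical, and the result is built with a plain "http://" prefix rather than "http://www.". Nothing here claims to be what Facebook actually does.

```javascript
const dns = require("dns").promises;

// Same candidate pattern as in the answer, with "g" and "i" flags added (assumption).
const HOST_PATTERN = /[a-z_.\-0-9]+\.[a-z]+/gi;

// Default check (assumption): a candidate counts as a URL if its host name resolves in DNS.
async function dnsResolves(host) {
  try {
    await dns.lookup(host);
    return true;
  } catch {
    return false;
  }
}

// Find candidates with the regex, then keep only those that pass the lookup.
// The `resolves` parameter is injectable so the check can be swapped or stubbed.
async function detectUrls(text, resolves = dnsResolves) {
  const candidates = [...new Set(text.match(HOST_PATTERN) || [])];
  const urls = [];
  for (const host of candidates) {
    if (await resolves(host)) {
      urls.push("http://" + host);
    }
  }
  return urls;
}

// Usage: results depend on the network, so no output is shown here.
// detectUrls("I always use stackoverflow.com; document.write(foo)").then(console.log);
```

In a browser the lookup would have to happen on the server, since page scripts cannot query DNS directly. A stricter variant could request the page itself instead of only resolving the host name.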


Tags: javascript, regex, facebook

Sam
