I'm trying to get the domain of a given URL. For example, http://www.facebook.com/someuser/ should return facebook.com. The given URL can be in any of these formats:
https://www.facebook.com/someuser (www. is optional, but should be ignored)
www.facebook.com/someuser (http:// is not required)
facebook.com/someuser
http://someuser.tumblr.com -> this has to return tumblr.com only
I wrote this regex:
/(?: \.|\/{2})(?: www\.)?([^\/]*)/i
But it does not work as I expect.
I can do this in parts:
Remove http:// and https://, if present in the string, with string.delete "/https?:\/\//i".
Remove www. with string.delete "/www\./i".
Then match the domain with /(\w+\.\w+)+/i
But this won't work with subdomains. Strings for testing:
https://www.facebook.com/username
http://last.fm/user/username
www.google.com
facebook.com/username
http://sub.tumblr.com/
sub.tumblr.com
I need this to work with as little memory and processing cost as possible.
Any ideas?
Why don't you just use the URI class to do this?
require 'uri'
URI.parse(your_uri).host
And you're done.
Just one thing: if there's no "http://" or "https://" at the beginning of the URL, you'll have to add one, otherwise the parse method won't give you a host (it will be nil).
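To tie this back to the question, here is a minimal sketch of how the missing scheme and the subdomain could both be handled. The helper name domain_of is hypothetical, and keeping only the last two labels of the host is my assumption about what "domain" means here. That assumption gives the wrong answer for suffixes such as co.uk, where something like a public suffix list would be needed. Note also that URI.parse raises URI::InvalidURIError on malformed input, which this sketch does not rescue.

```ruby
require 'uri'

# Hypothetical helper, not part of the URI library.
# Assumption: the domain is the last two dot-separated labels of the host
# (so "someuser.tumblr.com" becomes "tumblr.com"). Wrong for e.g. "co.uk".
def domain_of(url)
  # URI.parse only fills in the host when a scheme is present,
  # so prepend one if the string does not start with http:// or https://.
  url = "http://#{url}" unless url =~ %r{\Ahttps?://}i

  host = URI.parse(url).host
  return nil if host.nil?

  # Keep the last two labels; this also drops a leading "www.".
  host.downcase.split('.').last(2).join('.')
end

# The test strings from the question.
urls = %w[
  https://www.facebook.com/username
  http://last.fm/user/username
  www.google.com
  facebook.com/username
  http://sub.tumblr.com/
  sub.tumblr.com
]

urls.each { |u| puts "#{u} -> #{domain_of(u)}" }
```

With the question's test strings this should give facebook.com, last.fm, google.com, facebook.com, tumblr.com and tumblr.com respectively.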