Get protocol + host name from URL


You should be able to do it with urlparse (in Python 3 it lives in urllib.parse; in Python 2 it is the standalone urlparse module):

from urllib.parse import urlparse
# from urlparse import urlparse  # Python 2
parsed_uri = urlparse('http://stackoverflow.com/questions/1234567/blah-blah-blah-blah')
result = '{uri.scheme}://{uri.netloc}/'.format(uri=parsed_uri)
print(result)

# prints:
# http://stackoverflow.com/
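
If you need the pieces separately, the parsed result also exposes them as attributes. A minimal sketch follows; the URL here is made up for illustration (it is not from the question) so that every component is present:

```python
from urllib.parse import urlparse

# Hypothetical URL with credentials and a port, chosen only to show each part.
parsed = urlparse('https://user:secret@Example.com:8443/path/page?q=1#top')

print(parsed.scheme)    # https
print(parsed.netloc)    # user:secret@Example.com:8443
print(parsed.hostname)  # example.com  (lower-cased, no credentials or port)
print(parsed.port)      # 8443         (an int, or None if absent)
```

Note that netloc keeps the credentials and port, so use hostname if you only want the host itself.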

https://github.com/john-kurkowski/tldextract

tldextract goes a step further than urlparse: it splits the host name into subdomain, domain, and public suffix for you. Note that it does not return the protocol.

From their documentation:

>>> import tldextract
>>> tldextract.extract('http://forums.news.cnn.com/')
ExtractResult(subdomain='forums.news', domain='cnn', suffix='com')
>>> tldextract.extract('http://forums.bbc.co.uk/') # United Kingdom
ExtractResult(subdomain='forums', domain='bbc', suffix='co.uk')
>>> tldextract.extract('http://www.worldbank.org.kg/') # Kyrgyzstan
ExtractResult(subdomain='www', domain='worldbank', suffix='org.kg')

ExtractResult is a namedtuple, so it's simple to access the parts you want.

>>> ext = tldextract.extract('http://forums.bbc.co.uk')
>>> ext.domain
'bbc'
>>> '.'.join(ext[:2]) # rejoin subdomain and domain
'forums.bbc'

Python 3, using urlsplit:

from urllib.parse import urlsplit
url = "http://stackoverflow.com/questions/9626535/get-domain-name-from-url"
base_url = "{0.scheme}://{0.netloc}/".format(urlsplit(url))
print(base_url)
# http://stackoverflow.com/
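
One thing to watch for: if the input has no scheme, urlsplit puts everything into the path and the format string above silently yields "://". A possible way to guard against that, sketched with a hypothetical helper named base_url (the name and the choice to raise ValueError are mine, not part of the standard library):

```python
from urllib.parse import urlsplit


def base_url(url):
    """Return 'scheme://netloc/' for url; raise ValueError if either part is missing."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError("URL has no scheme or host: {!r}".format(url))
    return "{0.scheme}://{0.netloc}/".format(parts)


print(base_url("http://stackoverflow.com/questions/9626535/get-domain-name-from-url"))
# http://stackoverflow.com/

try:
    base_url("stackoverflow.com/questions/9626535")  # no scheme, so no netloc either
except ValueError as exc:
    print("rejected:", exc)
```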

Using urljoin to resolve '/' against the URL:

>>> from urllib.parse import urljoin  # Python 2: from urlparse import urljoin
>>> url = 'http://stackoverflow.com/questions/1234567/blah-blah-blah-blah'
>>> urljoin(url, '/')
'http://stackoverflow.com/'
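
The urljoin trick works because joining with '/' replaces the path and drops the query string and fragment, while the scheme and network location are kept. A small sketch with a made-up URL (Python 3 import shown):

```python
from urllib.parse import urljoin

# Hypothetical URL with a port, query string, and fragment.
url = 'https://example.com:8443/a/b?x=1#frag'
root = urljoin(url, '/')
print(root)  # https://example.com:8443/
```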

Pure string operations :):

>>> url = "http://stackoverflow.com/questions/9626535/get-domain-name-from-url"
>>> url.split("//")[-1].split("/")[0].split('?')[0]
'stackoverflow.com'
>>> url = "stackoverflow.com/questions/9626535/get-domain-name-from-url"
>>> url.split("//")[-1].split("/")[0].split('?')[0]
'stackoverflow.com'
>>> url = "http://foo.bar?haha/whatever"
>>> url.split("//")[-1].split("/")[0].split('?')[0]
'foo.bar'
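
Be aware that the split chain returns only the host (no protocol), and it can be fooled by less tidy URLs. A quick sketch comparing it with urlsplit; the helper name host_by_split and both URLs are made up for illustration:

```python
from urllib.parse import urlsplit


def host_by_split(url):
    # Same chain of splits as in the snippet above.
    return url.split("//")[-1].split("/")[0].split('?')[0]


# Credentials and port stay attached to the result.
url1 = "http://user@example.com:8080/path"
print(host_by_split(url1))        # user@example.com:8080
print(urlsplit(url1).hostname)    # example.com

# A "//" later in the URL makes the split pick the wrong part.
url2 = "http://example.com/?next=//other.example/x"
print(host_by_split(url2))        # other.example
print(urlsplit(url2).hostname)    # example.com
```

For input you do not control, the urllib.parse functions are the safer choice.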

That's all, folks.
