
Changing hostname in a url

Tags: python, url

I am trying to use Python to change the hostname in a URL, and have been playing around with the urlparse module for a while now without finding a satisfactory solution. As an example, consider the URL:

https://www.google.dk:80/barbaz

I would like to replace "www.google.dk" with e.g. "www.foo.dk", so I get the following url:

https://www.foo.dk:80/barbaz.

So the part I want to replace is what urlparse.urlsplit refers to as hostname. I had hoped that the result of urlsplit would let me make changes, but the resulting type, SplitResult, doesn't allow me to. If nothing else, I can of course reconstruct the new URL by concatenating all the parts with +, but that would leave me with quite ugly code and a lot of conditionals to get "://" and ":" in the correct places.

Rikke Bendlin Gammelmark asked Feb 07 '14 13:02



2 Answers

You can use the urllib.parse.urlparse function and the ParseResult._replace method (Python 3):

>>> import urllib.parse
>>> parsed = urllib.parse.urlparse("https://www.google.dk:80/barbaz")
>>> replaced = parsed._replace(netloc="www.foo.dk:80")
>>> print(replaced)
ParseResult(scheme='https', netloc='www.foo.dk:80', path='/barbaz', params='', query='', fragment='')

If you're using Python 2, then replace urllib.parse with urlparse.

ParseResult is a subclass of namedtuple and _replace is a namedtuple method that:

returns a new instance of the named tuple replacing specified fields with new values
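To make the "returns a new instance" part concrete, here is a minimal sketch (Python 3). `Point` is a hypothetical named tuple invented for illustration; the second half applies the same mechanism to a ParseResult.

```python
from collections import namedtuple
from urllib.parse import urlparse

# Hypothetical stand-in type, used only to illustrate _replace.
Point = namedtuple("Point", ["x", "y"])

p = Point(x=1, y=2)
q = p._replace(x=10)  # builds a new Point; p itself is left untouched

# The same mechanism works on the result of urlparse.
parsed = urlparse("https://www.google.dk:80/barbaz")
swapped = parsed._replace(scheme="http")

print(p, q)
print(parsed.scheme, swapped.scheme)
```

Because the original object is never modified, the result of _replace has to be assigned to a name to be of any use.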

UPDATE:

As @2rs2ts pointed out in a comment, the netloc attribute includes the port number.

Good news: ParseResult has hostname and port attributes. Bad news: hostname and port are not fields of the namedtuple; they are read-only properties derived from netloc, so you can't do parsed._replace(hostname="www.foo.dk"). It will raise an exception.
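A short sketch of that behaviour, assuming Python 3. The exception type is caught broadly here because it can differ between Python versions; the `failed` flag is just an illustrative name.

```python
from urllib.parse import urlparse

parsed = urlparse("https://www.google.dk:80/barbaz")

# hostname and port are computed from netloc each time they are accessed.
print(parsed.hostname)
print(parsed.port)

# Only the real tuple fields are accepted by _replace.
print(parsed._fields)

try:
    parsed._replace(hostname="www.foo.dk")
    failed = False
except (ValueError, TypeError) as exc:
    # hostname is not one of the tuple fields, so _replace rejects it.
    failed = True
    print(type(exc).__name__, exc)
```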

If you don't want to split on ":" yourself, and your URL always has a port number and no username or password (i.e. it is not a URL like "https://username:password@www.google.dk:80/barbaz"), you can do:

parsed._replace(netloc="{}:{}".format("www.foo.dk", parsed.port))
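If the URL may or may not carry a port or a username and password, the netloc can be rebuilt piece by piece. The following is only a sketch under stated assumptions: `replace_hostname` is a hypothetical helper name, Python 3 is assumed, and the new hostname is taken to be a plain name (an IPv6 literal would need its own square brackets).

```python
from urllib.parse import urlsplit, urlunsplit


def replace_hostname(url, new_hostname):
    """Return url with its hostname swapped, keeping userinfo and port."""
    parts = urlsplit(url)

    netloc = new_hostname
    if parts.port is not None:
        netloc = "{}:{}".format(netloc, parts.port)

    # Everything before the last "@" in netloc is the user[:password] part.
    userinfo, at, _ = parts.netloc.rpartition("@")
    if at:
        netloc = "{}@{}".format(userinfo, netloc)

    # _replace returns a new SplitResult; urlunsplit turns it back into a string.
    return urlunsplit(parts._replace(netloc=netloc))


print(replace_hostname("https://www.google.dk:80/barbaz", "www.foo.dk"))
```

The conditionals for ":" and "@" live in one small function, and the "://" handling is left to urlunsplit.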
Nigel Tufnel answered Sep 19 '22 12:09


You can take advantage of urlsplit and urlunsplit from Python's urlparse module (urllib.parse in Python 3):

>>> from urlparse import urlsplit, urlunsplit
>>> url = list(urlsplit('https://www.google.dk:80/barbaz'))
>>> url
['https', 'www.google.dk:80', '/barbaz', '', '']
>>> url[1] = 'www.foo.dk:80'
>>> new_url = urlunsplit(url)
>>> new_url
'https://www.foo.dk:80/barbaz'

As the docs state, the argument passed to urlunsplit() "can be any five-item iterable", so the above code works as expected.
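Since any five-item iterable is accepted, a plain tuple works as well as a list. A minimal Python 3 sketch of the same idea, unpacking the five parts into names instead of indexing:

```python
from urllib.parse import urlsplit, urlunsplit

# urlsplit returns five parts, so the result can be unpacked directly.
scheme, netloc, path, query, fragment = urlsplit("https://www.google.dk:80/barbaz")

# Pass a tuple with the new netloc in second position; no list conversion needed.
new_url = urlunsplit((scheme, "www.foo.dk:80", path, query, fragment))
print(new_url)
```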

linkyndy answered Sep 22 '22 12:09