Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Modify URL components in Python 2

Is there a cleaner way to modify some parts of a URL in Python 2?

For example

http://foo/bar -> http://foo/yah

At present, I'm doing this:

import urlparse

url = 'http://foo/bar'

# Modify path component of URL from 'bar' to 'yah'
# Use nasty convert-to-list hack due to urlparse.ParseResult being immutable
parts = list(urlparse.urlparse(url))
parts[2] = 'yah'

url = urlparse.urlunparse(parts)

Is there a cleaner solution?

like image 592
Gareth Stockwell Avatar asked Jun 13 '14 08:06

Gareth Stockwell


People also ask

How do you change the URL in Python?

newurl = url. replace('/f/','/d/').

How do you trim a URL in Python?

sub() method to remove URLs from text, e.g. result = re. sub(r'http\S+', '', my_string) . The re. sub() method will remove any URLs from the string by replacing them with empty strings.

How do you concatenate a URL in Python?

Use the urljoin method from the urllib. parse module to join a base URL with another URLs, e.g. result = urljoin(base_url, path) . The urljoin method constructs a full (absolute) URL by combining a base URL with another URL.

How do I create a Python URL path?

The requests module can help us build the URLS and manipulate the URL value dynamically. Any sub-directory of the URL can be fetched programmatically and then some part of it can be substituted with new values to build new URLs.


1 Answers

Unfortunately, the documentation is out of date; the results produced by urlparse.urlparse() (and urlparse.urlsplit()) use a collections.namedtuple()-produced class as a base.

Don't turn this namedtuple into a list, but make use of the utility method provided for just this task:

parts = urlparse.urlparse(url)
parts = parts._replace(path='yah')

url = parts.geturl()

The namedtuple._replace() method lets you create a new copy with specific elements replaced. The ParseResult.geturl() method then re-joins the parts into a url for you.

Demo:

>>> import urlparse
>>> url = 'http://foo/bar'
>>> parts = urlparse.urlparse(url)
>>> parts = parts._replace(path='yah')
>>> parts.geturl()
'http://foo/yah'

mgilson filed a bug report (with patch) to address the documentation issue.

like image 90
Martijn Pieters Avatar answered Oct 06 '22 01:10

Martijn Pieters