I'm curious if there's a simpler way to remove a particular parameter from a url. What I came up with is the following. This seems a bit verbose. Libraries to use or a more pythonic version appreciated.
parsed = urlparse(url)
if parsed.query != "":
params = dict([s.split("=") for s in parsed.query.split("&")])
if params.get("page"):
del params["page"]
url = urlunparse((parsed.scheme,
None,
parsed.path,
None,
urlencode(params.items()),
parsed.fragment,))
parsed = urlparse(url)
Practical Data Science using PythonThe requests module can help us build the URLS and manipulate the URL value dynamically. Any sub-directory of the URL can be fetched programmatically and then some part of it can be substituted with new values to build new URLs.
To send parameters in URL, write all parameter key:value pairs to a dictionary and send them as params argument to any of the GET, POST, PUT, HEAD, DELETE or OPTIONS request. then https://somewebsite.com/?param1=value1¶m2=value2 would be our final url.
How do you encode a URL in Python? In Python 3+, You can URL encode any string using the quote() function provided by urllib. parse package. The quote() function by default uses UTF-8 encoding scheme.
Method 1 – Using the += Operator In the code above, we start by defining two strings. The first holding the domain and the other holding the extension. We then use the += operator to append the first string into the second string and print it out.
Method #1 : Using split() ' and return the first part of split for result.
To check whether the string entered is a valid URL or not we use the validators module in Python. When we pass the string to the method url() present in the module it returns true(if the string is URL) and ValidationFailure(func=url, …) if URL is invalid.
netloc : Contains the network location - which includes the domain itself (and subdomain if present), the port number, along with an optional credentials in form of username:password . Together it may take form of username:[email protected]:80 .
Use urlparse.parse_qsl()
to crack the query string. You can filter this in one go:
params = [(k,v) for (k,v) in parse_qsl(parsed.query) if k != 'page']
I've created a small helper class to represent a url in a structured way:
import cgi, urllib, urlparse
class Url(object):
def __init__(self, url):
"""Construct from a string."""
self.scheme, self.netloc, self.path, self.params, self.query, self.fragment = urlparse.urlparse(url)
self.args = dict(cgi.parse_qsl(self.query))
def __str__(self):
"""Turn back into a URL."""
self.query = urllib.urlencode(self.args)
return urlparse.urlunparse((self.scheme, self.netloc, self.path, self.params, self.query, self.fragment))
Then you can do:
u = Url(url)
del u.args['page']
url = str(u)
More about this: Web development peeve.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With