
Is there a better way to write this URL Manipulation in Python?

I'm curious whether there's a simpler way to remove a particular parameter from a URL. What I came up with is below, and it seems a bit verbose. Suggestions for libraries or a more Pythonic version are appreciated.

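# Python 2 imports
from urllib import urlencode
from urlparse import urlparse, urlunparse

parsed = urlparse(url)
if parsed.query != "":
    # split on the first "=" only, so values that contain "=" survive
    params = dict([s.split("=", 1) for s in parsed.query.split("&")])
    if params.get("page"):
        del params["page"]
    # pass netloc and params through; None here would drop the host
    url = urlunparse((parsed.scheme,
                      parsed.netloc,
                      parsed.path,
                      parsed.params,
                      urlencode(params.items()),
                      parsed.fragment,))
    parsed = urlparse(url)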
dnolen asked May 20 '10 12:05



2 Answers

Use urlparse.parse_qsl() to crack the query string. You can filter this in one go:

params = [(k,v) for (k,v) in parse_qsl(parsed.query) if k != 'page']
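For anyone on Python 3, here is a minimal sketch of the same filtering idea as a complete round trip. It assumes the Python 3 module layout, where parse_qsl lives in urllib.parse. The function name remove_param and the example.com URL are made up for illustration and are not part of the original answer.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def remove_param(url, name):
    """Return url with every occurrence of the query parameter `name` removed.

    remove_param is a hypothetical helper name used only for this sketch.
    """
    parsed = urlparse(url)
    # keep_blank_values=True keeps parameters such as "debug=" in the result
    params = [(k, v)
              for (k, v) in parse_qsl(parsed.query, keep_blank_values=True)
              if k != name]
    # ParseResult is a namedtuple, so _replace swaps in the new query string
    return urlunparse(parsed._replace(query=urlencode(params)))


print(remove_param("http://example.com/list?page=2&sort=asc#top", "page"))
# prints: http://example.com/list?sort=asc#top
```

Keeping the parameters as a list of pairs rather than a dict preserves repeated keys and their order. Note that urlencode re-escapes the values, so the rebuilt query may not be byte-for-byte identical to the input.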
Marcelo Cantos answered Feb 02 '23 07:02


I've created a small helper class to represent a url in a structured way:

import urllib, urlparse

class Url(object):
    def __init__(self, url):
        """Construct from a string."""
        self.scheme, self.netloc, self.path, self.params, self.query, self.fragment = urlparse.urlparse(url)
        # urlparse.parse_qsl replaces cgi.parse_qsl, which is deprecated since Python 2.6
        self.args = dict(urlparse.parse_qsl(self.query))

    def __str__(self):
        """Turn back into a URL."""
        self.query = urllib.urlencode(self.args)
        return urlparse.urlunparse((self.scheme, self.netloc, self.path, self.params, self.query, self.fragment))

Then you can do:

u = Url(url)
u.args.pop('page', None)  # unlike del, no KeyError when 'page' is absent
url = str(u)

More about this: Web development peeve.
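The class above targets Python 2. A rough Python 3 equivalent might look like the sketch below. It assumes urllib.parse in place of the urllib and urlparse modules, and it stores the parsed result in a single attribute named parts, which is a choice made for this sketch rather than something from the original class.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


class Url:
    """Structured view of a URL with an editable dict of query arguments."""

    def __init__(self, url):
        # parts is a hypothetical attribute name holding the ParseResult
        self.parts = urlparse(url)
        # dict() keeps only the last value when a key is repeated
        self.args = dict(parse_qsl(self.parts.query))

    def __str__(self):
        # Rebuild the query from args and reassemble the full URL
        return urlunparse(self.parts._replace(query=urlencode(self.args)))


u = Url("http://example.com/list?page=2&sort=asc")
u.args.pop("page", None)  # no error if "page" is not present
print(u)
# prints: http://example.com/list?sort=asc
```

Because the arguments are held in a dict, a query with repeated keys such as tag=a&tag=b loses all but the last value. If that matters, a list of pairs is the safer representation.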

Ned Batchelder answered Feb 02 '23 05:02