I need to know how to build URLs in Python like:
http://subdomain.domain.com?arg1=someargument&arg2=someotherargument
What library would you recommend to use and why? Is there a "best" choice for this kind of library?
Also, can you provide me with sample code to start with using the library?
The requests module can help us build URLs and manipulate them dynamically. Any path segment of a URL can be extracted programmatically and then replaced with a new value to build new URLs.
Use the urljoin function from the urllib.parse module to join a base URL with another URL, e.g. result = urljoin(base_url, path). urljoin constructs a full (absolute) URL by combining a base URL with another, possibly relative, URL.
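To make the joining rules concrete, here is a minimal sketch. The example.com URLs and the variable names are made up for illustration; the point is how a trailing slash on the base and a leading slash on the second argument change the result.

```python
from urllib.parse import urljoin

base_url = "https://example.com/docs/guide/"

# A relative path is resolved against the base URL's directory.
relative = urljoin(base_url, "intro.html")

# A path starting with "/" replaces the whole path of the base URL.
rooted = urljoin(base_url, "/about")

# Without a trailing slash, the last segment of the base is replaced.
no_slash = urljoin("https://example.com/docs/guide", "intro.html")

print(relative)  # https://example.com/docs/guide/intro.html
print(rooted)    # https://example.com/about
print(no_slash)  # https://example.com/docs/intro.html
```

If you want the second argument appended rather than substituted, the base URL generally needs to end with a slash.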
You can encode multiple parameters at once using the urllib.parse.urlencode() function. This is a convenience function which takes a dictionary of key-value pairs or a sequence of two-element tuples and uses the quote_plus() function to encode every key and value.
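As a rough illustration of both input shapes, the sketch below encodes a dictionary and a list of tuples. The parameter names and values are invented for the example; doseq is an existing keyword argument of urlencode.

```python
from urllib.parse import urlencode

# A dictionary of parameters (insertion order is kept on Python 3.7+).
from_dict = urlencode({"arg1": "some argument", "arg2": "a&b=c"})

# A sequence of two-element tuples allows the same key to appear twice.
from_pairs = urlencode([("tag", "python"), ("tag", "url building")])

# doseq=True expands a list value into one key=value pair per item.
from_list = urlencode({"tag": ["python", "urls"]}, doseq=True)

print(from_dict)   # arg1=some+argument&arg2=a%26b%3Dc
print(from_pairs)  # tag=python&tag=url+building
print(from_list)   # tag=python&tag=urls
```

Note that reserved characters such as & and = inside a value are percent-encoded, so they cannot be mistaken for separators.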
In a form-encoded (application/x-www-form-urlencoded) query string, a space is encoded as the + sign. See Wikipedia and the HTML specification. As such, urllib.quote_plus() (urllib.parse.quote_plus() in Python 3) should be used instead when encoding just one key or value, or use urllib.
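The difference between the two quoting functions is easiest to see side by side. This is a small sketch for Python 3, with an arbitrary sample value:

```python
from urllib.parse import quote, quote_plus, unquote_plus

value = "some argument/with slash"

# quote_plus: spaces become "+", and "/" is percent-encoded too.
plus_encoded = quote_plus(value)

# quote: spaces become "%20", and "/" is left alone by default (safe="/").
percent_encoded = quote(value)

print(plus_encoded)     # some+argument%2Fwith+slash
print(percent_encoded)  # some%20argument/with%20slash

# unquote_plus reverses quote_plus.
round_trip = unquote_plus(plus_encoded)
print(round_trip == value)  # True
```

quote() is usually the better fit for path segments, while quote_plus() matches what urlencode() produces for query strings.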
I would go for Python's urllib; it's a built-in library.
Python 2
import urllib
url = 'https://example.com/somepage/?'
params = {'var1': 'some data', 'var2': 1337}
print(url + urllib.urlencode(params))
Python 3
import urllib.parse
url = 'https://example.com/somepage/?'
params = {'var1': 'some data', 'var2': 1337}
print(url + urllib.parse.urlencode(params))
Output:
https://example.com/somepage/?var1=some+data&var2=1337
urlparse (urllib.parse in Python 3) in the Python standard library is all about building valid URLs. Check the documentation of urlparse.
Example:
from collections import namedtuple
from urllib.parse import urlencode, urlunparse

# namedtuple matching the six positional components urlunparse expects:
# (scheme, netloc, path, params, query, fragment)
Components = namedtuple(
    typename='Components',
    field_names=['scheme', 'netloc', 'path', 'params', 'query', 'fragment']
)

query_params = {
    'param1': 'some data',
    'param2': 42
}

url = urlunparse(
    Components(
        scheme='https',
        netloc='example.com',
        path='/',
        params='',
        query=urlencode(query_params),
        fragment='anchor'
    )
)

print(url)
Output:
https://example.com/?param1=some+data&param2=42#anchor
Here is an example of using urlparse to generate URLs. This provides the convenience of adding a path to the URL without worrying about checking slashes.
import urllib.parse

def build_url(base_url, path, args_dict):
    # list() gives a mutable list in the structure of urllib.parse.ParseResult
    url_parts = list(urllib.parse.urlparse(base_url))
    url_parts[2] = path  # index 2 is the path component
    url_parts[4] = urllib.parse.urlencode(args_dict)  # index 4 is the query
    return urllib.parse.urlunparse(url_parts)
>>> args = {'arg1': 'value1', 'arg2': 'value2'}
>>> # works with double slash scenario
>>> build_url('http://www.example.com/', '/somepage/index.html', args)
'http://www.example.com/somepage/index.html?arg1=value1&arg2=value2'
>>> # works without slash
>>> build_url('http://www.example.com', 'somepage/index.html', args)
'http://www.example.com/somepage/index.html?arg1=value1&arg2=value2'
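One thing to be aware of: build_url replaces whatever query string the base URL already had. If you need to keep existing parameters, a variation along these lines might work. The helper name add_query_params is hypothetical (not part of urllib), and the URL is the one from the question.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

def add_query_params(url, extra_params):
    """Return url with extra_params appended to any query it already has.

    Hypothetical helper for illustration; it is not part of urllib.
    """
    parts = urlparse(url)
    # parse_qsl returns the existing query as a list of (key, value) pairs.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.extend(extra_params.items())
    # ParseResult is a namedtuple, so _replace returns an updated copy.
    return urlunparse(parts._replace(query=urlencode(pairs)))

result = add_query_params(
    "http://subdomain.domain.com/?arg1=someargument",
    {"arg2": "someotherargument"},
)
print(result)
# http://subdomain.domain.com/?arg1=someargument&arg2=someotherargument
```

Working with a list of pairs instead of a dict keeps repeated keys intact, at the cost of not deduplicating them.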