How to join components of a path when you are constructing a URL in Python

People also ask

How do you join a path in Python?

path. join() method in Python join one or more path components intelligently. This method concatenates various path components with exactly one directory separator ('/') following each non-empty part except the last path component.

How do you join a URL?

Use the urljoin method from the urllib. parse module to join a base URL with another URLs, e.g. result = urljoin(base_url, path) . The urljoin method constructs a full (absolute) URL by combining a base URL with another URL. Copied!

You can use urllib.parse.urljoin:

>>> from urllib.parse import urljoin
>>> urljoin('/media/path/', 'js/foo.js')
'/media/path/js/foo.js'

But beware:

>>> urljoin('/media/path', 'js/foo.js')
'/media/js/foo.js'
>>> urljoin('/media/path', '/js/foo.js')
'/js/foo.js'

The reason you get different results from /js/foo.js and js/foo.js is because the former begins with a slash which signifies that it already begins at the website root.

On Python 2, you have to do

from urlparse import urljoin

Since, from the comments the OP posted, it seems he doesn't want to preserve "absolute URLs" in the join (which is one of the key jobs of urlparse.urljoin;-), I'd recommend avoiding that. os.path.join would also be bad, for exactly the same reason.

So, I'd use something like '/'.join(s.strip('/') for s in pieces) (if the leading / must also be ignored -- if the leading piece must be special-cased, that's also feasible of course;-).

Like you say, os.path.join joins paths based on the current os. posixpath is the underlying module that is used on posix systems under the namespace os.path:

>>> os.path.join is posixpath.join
True
>>> posixpath.join('/media/', 'js/foo.js')
'/media/js/foo.js'

So you can just import and use posixpath.join instead for urls, which is available and will work on any platform.

Edit: @Pete's suggestion is a good one, you can alias the import for increased readability

from posixpath import join as urljoin

Edit: I think this is made clearer, or at least helped me understand, if you look into the source of os.py (the code here is from Python 2.7.11, plus I've trimmed some bits). There's conditional imports in os.py that picks which path module to use in the namespace os.path. All the underlying modules (posixpath, ntpath, os2emxpath, riscospath) that may be imported in os.py, aliased as path, are there and exist to be used on all systems. os.py is just picking one of the modules to use in the namespace os.path at run time based on the current OS.

# os.py
import sys, errno

_names = sys.builtin_module_names

if 'posix' in _names:
    # ...
    from posix import *
    # ...
    import posixpath as path
    # ...

elif 'nt' in _names:
    # ...
    from nt import *
    # ...
    import ntpath as path
    # ...

elif 'os2' in _names:
    # ...
    from os2 import *
    # ...
    if sys.version.find('EMX GCC') == -1:
        import ntpath as path
    else:
        import os2emxpath as path
        from _emx_link import link
    # ...

elif 'ce' in _names:
    # ...
    from ce import *
    # ...
    # We can use the standard Windows path.
    import ntpath as path

elif 'riscos' in _names:
    # ...
    from riscos import *
    # ...
    import riscospath as path
    # ...

else:
    raise ImportError, 'no os specific module found'

This does the job nicely:

def urljoin(*args):
    """
    Joins given arguments into an url. Trailing but not leading slashes are
    stripped for each argument.
    """

    return "/".join(map(lambda x: str(x).rstrip('/'), args))

The basejoin function in the urllib package might be what you're looking for.

basejoin = urljoin(base, url, allow_fragments=True)
    Join a base URL and a possibly relative URL to form an absolute
    interpretation of the latter.

Edit: I didn't notice before, but urllib.basejoin seems to map directly to urlparse.urljoin, making the latter preferred.

Using furl, pip install furl it will be:

 furl.furl('/media/path/').add(path='js/foo.js')

Related questions
                            
                                Accessing elements of Python dictionary by index
                            
                                "Too many values to unpack" Exception
                            
                                How to call Base Class's __init__ method from the child class? [duplicate]
                            
                                Using .sort with PyMongo
                            
                                Instance variables vs. class variables in Python
                            
                                Python: print a generator expression?
                            
                                surface plots in matplotlib
                            
                                Implementing slicing in __getitem__
                            
                                How to write setup.py to include a Git repository as a dependency
                            
                                Finding a substring within a list in Python [duplicate]
                            
                                Activate a virtualenv via fabric as deploy user
                            
                                Modulus % in Django template
                            
                                How to save and load numpy.array() data properly?
                            
                                How to run Django's test database only in memory?
                            
                                Python - TypeError: Object of type 'int64' is not JSON serializable
                            
                                Show the progress of a Python multiprocessing pool imap_unordered call?
                            
                                Adding value labels on a matplotlib bar chart
                            
                                Sorting Python list based on the length of the string
                            
                                "Private" (implementation) class in Python
                            
                                How do I find the maximum of 2 numbers?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to join components of a path when you are constructing a URL in Python

Tags:

python

url

People also ask

Recent Activity

Donate For Us