The os.path.join() method in Python joins one or more path components intelligently. It concatenates the components with exactly one directory separator ('/' on POSIX systems) following each non-empty part except the last.
Use the urljoin function from the urllib.parse module to join a base URL with another URL, e.g. result = urljoin(base_url, path). The urljoin function constructs a full (absolute) URL by combining a base URL with another URL.
You can use urllib.parse.urljoin:
>>> from urllib.parse import urljoin
>>> urljoin('/media/path/', 'js/foo.js')
'/media/path/js/foo.js'
But beware:
>>> urljoin('/media/path', 'js/foo.js')
'/media/js/foo.js'
>>> urljoin('/media/path', '/js/foo.js')
'/js/foo.js'
The reason you get different results from /js/foo.js and js/foo.js is that the former begins with a slash, which signifies that it already begins at the website root.
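If the base should always be treated as a directory, the trailing-slash pitfall above can be guarded against with a small wrapper. This is only a minimal sketch: join_under is a hypothetical helper name, not part of urllib, and it assumes that appending a slash to the base is acceptable for your URLs.

```python
from urllib.parse import urljoin


def join_under(base, relative):
    """Join `relative` beneath `base`, treating `base` as a directory.

    Hypothetical helper for illustration; it is not part of urllib.
    """
    if not base.endswith('/'):
        base += '/'  # without this, urljoin replaces the last segment of base
    return urljoin(base, relative)


print(join_under('/media/path', 'js/foo.js'))   # /media/path/js/foo.js
print(join_under('/media/path/', 'js/foo.js'))  # /media/path/js/foo.js
print(join_under('/media/path', '/js/foo.js'))  # /js/foo.js
```

Note that a relative part starting with a slash still resolves from the website root; the wrapper only changes how the base is interpreted.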
On Python 2, you have to do
from urlparse import urljoin
Since, from the comments the OP posted, it seems he doesn't want to preserve "absolute URLs" in the join (which is one of the key jobs of urlparse.urljoin ;-), I'd recommend avoiding that. os.path.join would also be bad, for exactly the same reason.
So, I'd use something like '/'.join(s.strip('/') for s in pieces) (if the leading / must also be ignored -- if the leading piece must be special-cased, that's also feasible of course ;-).
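The one-liner above can be wrapped in a function to see what it does with slashes at the edges of each piece. A minimal sketch, where join_pieces is a hypothetical name chosen for illustration:

```python
def join_pieces(*pieces):
    """Join URL path pieces with single slashes, ignoring any slashes
    at the start or end of each piece.

    Hypothetical helper for illustration.
    """
    return '/'.join(str(p).strip('/') for p in pieces)


# Leading and trailing slashes on every piece are dropped,
# including the leading slash of the first piece.
print(join_pieces('/media/path/', '/js/foo.js'))       # media/path/js/foo.js

# Special-casing the leading slash: add it back explicitly.
print('/' + join_pieces('/media/path/', 'js/foo.js'))  # /media/path/js/foo.js
```

Unlike urljoin, a piece that starts with a slash does not reset the result to the root, which is the behavior the OP asked for. An empty piece would still produce a doubled slash, so filter those out first if they can occur.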
Like you say, os.path.join joins paths based on the current OS. posixpath is the underlying module that is used on POSIX systems under the namespace os.path:
>>> import os, posixpath
>>> os.path.join is posixpath.join
True
>>> posixpath.join('/media/', 'js/foo.js')
'/media/js/foo.js'
So you can just import and use posixpath.join instead for URLs, which is available and will work on any platform.
Edit: @Pete's suggestion is a good one; you can alias the import for increased readability:
from posixpath import join as urljoin
Edit: I think this is made clearer, or at least it helped me understand, if you look into the source of os.py (the code here is from Python 2.7.11, plus I've trimmed some bits). There are conditional imports in os.py that pick which path module to use in the namespace os.path. All the underlying modules (posixpath, ntpath, os2emxpath, riscospath) that may be imported in os.py, aliased as path, are there and exist to be used on all systems. os.py is just picking one of the modules to use in the namespace os.path at run time based on the current OS.
# os.py
import sys, errno

_names = sys.builtin_module_names

if 'posix' in _names:
    # ...
    from posix import *
    # ...
    import posixpath as path
    # ...

elif 'nt' in _names:
    # ...
    from nt import *
    # ...
    import ntpath as path
    # ...

elif 'os2' in _names:
    # ...
    from os2 import *
    # ...
    if sys.version.find('EMX GCC') == -1:
        import ntpath as path
    else:
        import os2emxpath as path
        from _emx_link import link
    # ...

elif 'ce' in _names:
    # ...
    from ce import *
    # ...
    # We can use the standard Windows path.
    import ntpath as path

elif 'riscos' in _names:
    # ...
    from riscos import *
    # ...
    import riscospath as path
    # ...

else:
    raise ImportError, 'no os specific module found'
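One caveat when using posixpath.join for URLs: like urljoin, it treats a component that starts with a slash as absolute and discards everything before it. The short sketch below illustrates this with the same example paths used earlier in this thread:

```python
import posixpath

# Relative second component: appended to the first.
print(posixpath.join('/media/', 'js/foo.js'))       # /media/js/foo.js

# No trailing slash on the first component: a separator is inserted,
# so the last segment is kept (unlike urllib's urljoin).
print(posixpath.join('/media/path', 'js/foo.js'))   # /media/path/js/foo.js

# Absolute second component: everything before it is discarded.
print(posixpath.join('/media/path', '/js/foo.js'))  # /js/foo.js
```

So posixpath.join avoids the missing-trailing-slash surprise, but it is not a fit if later pieces may begin with a slash and should still be appended.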
This does the job nicely:
def urljoin(*args):
    """
    Joins given arguments into an url. Trailing but not leading slashes are
    stripped for each argument.
    """
    return "/".join(map(lambda x: str(x).rstrip('/'), args))
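A quick usage sketch of this function follows, with the definition repeated so the snippet runs on its own. The example URLs are made up for illustration.

```python
def urljoin(*args):
    """Same function as above, repeated so this snippet is self-contained."""
    return "/".join(map(lambda x: str(x).rstrip('/'), args))


# Trailing slashes are stripped before joining, so no doubled slash appears.
print(urljoin('http://example.com/', 'media/', 'js/foo.js'))
# http://example.com/media/js/foo.js

# Leading slashes are kept, so a later piece that starts with one
# produces a doubled slash.
print(urljoin('/media/path/', '/js/foo.js'))
# /media/path//js/foo.js
```

Because only trailing slashes are removed, this works best when the later pieces are known not to start with a slash. A trailing slash on the last argument is also dropped.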
The basejoin function in the urllib package might be what you're looking for.
basejoin = urljoin(base, url, allow_fragments=True)
Join a base URL and a possibly relative URL to form an absolute
interpretation of the latter.
Edit: I didn't notice before, but urllib.basejoin seems to map directly to urlparse.urljoin, making the latter preferred.
Using furl (pip install furl), it will be:
furl.furl('/media/path/').add(path='js/foo.js')