 

urlparse.urlparse returning 3 '/' instead of 2 after scheme

I'd like to add the 'http' scheme in front of a given URL string if it's missing, and otherwise leave the URL alone, so I thought urlparse was the right way to do this. But whenever there's no scheme and I call geturl(), I get '///' instead of '//' between the scheme and the domain.

>>> t = urlparse.urlparse('www.example.com', 'http')
>>> t.geturl()
'http:///www.example.com' # three ///

How do I convert this url so it actually looks like:

'http://www.example.com' # two //

Dan Holman asked Sep 02 '11 21:09




1 Answer

Short answer (but it's a bit tautological):

>>> urlparse.urlparse("http://www.example.com").geturl()
'http://www.example.com'

In your example code, the input has no '//' in front of the hostname, so the hostname is parsed as a path, not as a network location:

>>> urlparse.urlparse("www.example.com/go")
ParseResult(scheme='', netloc='', path='www.example.com/go', params='', \
    query='', fragment='')

>>> urlparse.urlparse("http://www.example.com/go")
ParseResult(scheme='http', netloc='www.example.com', path='/go', params='', \
    query='', fragment='')
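
For the original goal (add 'http' only when the scheme is missing), one possible approach is to check the parsed scheme and prepend the prefix to the string yourself, rather than passing a default scheme and calling geturl(). This is a minimal sketch, assuming Python 3's urllib.parse (the module is named urlparse in Python 2). ensure_scheme is a hypothetical helper name, not part of the library:

```python
from urllib.parse import urlparse  # Python 2: from urlparse import urlparse


def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if it already has a scheme.

    Otherwise prefix it with default_scheme + '://'.
    (Hypothetical helper for illustration only.)
    """
    if urlparse(url).scheme:
        # A scheme such as 'http' or 'https' is already present.
        return url
    # Drop leading slashes so scheme-relative input like
    # '//www.example.com' does not end up with extra slashes.
    return default_scheme + "://" + url.lstrip("/")


print(ensure_scheme("www.example.com"))
print(ensure_scheme("http://www.example.com"))
```

Because the prefix is added before any reassembly, the hostname ends up in the network location when the result is parsed again. Inputs with a colon before any slash, such as host:port without a scheme, may be read as already having a scheme depending on the Python version, so this sketch does not try to handle them.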

miku answered Sep 22 '22 20:09