Combining a url with urlunparse

I'm writing something to 'clean' a URL. In this case, all I'm trying to do is add a fake scheme when one is missing, since urlopen won't work without one. However, if I test this with www.python.org, it returns http:///www.python.org. Does anyone know why there is an extra /, and is there a way to return the URL without it?

from urlparse import urlparse, urlunparse

def FixScheme(website):
    scheme, netloc, path, params, query, fragment = urlparse(website)

    if scheme == '':
        return urlunparse(('http', netloc, path, params, query, fragment))
    else:
        return website

asked Sep 26 '10 14:09

Ben

2 Answers

The problem is that, when parsing the very incomplete URL www.python.org, the string you give is taken as the path component of the URL, with both the netloc (network location) and the scheme left empty. To default the scheme, you can pass a second parameter, scheme, to urlparse (which simplifies your logic), but that doesn't help with the "empty netloc" problem. So you need some logic for that case, e.g.:

if not netloc:
    netloc, path = path, ''
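Putting the two ideas together (a default scheme passed to urlparse, plus moving a bare host out of the path), a complete function might look like the sketch below. The name fix_scheme and the default_scheme parameter are illustrative, not from the question. The import falls back from the Python 3 module name (urllib.parse) to the Python 2 one (urlparse). Splitting the path at its first "/" is an extra assumption, added so that an input such as www.python.org/doc/ keeps its path.

```python
try:
    from urllib.parse import urlparse, urlunparse  # Python 3
except ImportError:
    from urlparse import urlparse, urlunparse      # Python 2


def fix_scheme(website, default_scheme='http'):
    """Return website with a scheme, treating a bare host as the netloc.

    Illustrative sketch only; intended for http-like inputs.
    """
    # The second argument supplies the scheme when the input has none.
    scheme, netloc, path, params, query, fragment = urlparse(website, default_scheme)

    if not netloc:
        # urlparse put "host/rest" into path; split the host back out.
        netloc, slash, rest = path.partition('/')
        path = slash + rest

    return urlunparse((scheme, netloc, path, params, query, fragment))


print(fix_scheme('www.python.org'))           # http://www.python.org
print(fix_scheme('www.python.org/doc/'))      # http://www.python.org/doc/
print(fix_scheme('http://www.python.org/'))   # http://www.python.org/
```

URLs that already carry a scheme and a host pass through unchanged, so the function is safe to call on input that has already been cleaned.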

answered Oct 07 '22 02:10

Alex Martelli

It's because urlparse is interpreting "www.python.org" not as the hostname (netloc), but as the path, just as a browser would if it encountered that string in an href attribute. Then urlunparse seems to interpret scheme "http" specially. If you put in "x" as the scheme, you'll get "x:www.python.org".
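To see how the string is split, it can help to print the first three components directly. This is a small sketch, using the Python 3 module name with a Python 2 fallback; the second call shows that a leading "//" is what makes urlparse treat the host as the netloc.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2

# A bare host name: everything lands in the path component.
bare = urlparse('www.python.org')
print(bare[:3])          # ('', '', 'www.python.org')

# With a leading "//" the host is recognised as the netloc,
# and the second argument fills in the missing scheme.
with_slashes = urlparse('//www.python.org', 'http')
print(with_slashes[:3])  # ('http', 'www.python.org', '')
```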

I don't know what range of inputs you're dealing with, but it looks like you might not want urlparse and urlunparse.
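If the inputs are only ever bare host names or complete URLs, one possible alternative is plain string handling. The sketch below is an assumption about what that could look like, not something taken from the question: add_default_scheme and its regular expression are hypothetical, and the check only looks for a "scheme://" prefix at the very start of the string.

```python
import re

# Matches a leading "scheme://" such as "http://" or "ftp://".
_SCHEME_PREFIX = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*://')


def add_default_scheme(website, default_scheme='http'):
    """Prefix website with default_scheme unless it already starts with one."""
    if _SCHEME_PREFIX.match(website):
        return website
    # Drop any leading slashes so "//host" does not become "http:////host".
    return default_scheme + '://' + website.lstrip('/')


print(add_default_scheme('www.python.org'))          # http://www.python.org
print(add_default_scheme('https://www.python.org'))  # https://www.python.org
```

Anchoring the pattern at the start of the string means a "://" that appears later, for example inside a query string, is not mistaken for a scheme.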

answered Oct 07 '22 03:10

Ned Batchelder

Tags: python, urlparse