
How to remove scheme from url in Python?

I am working on a Flask application that returns URLs. I want the URL displayed to the user to be as clean as possible, so I want to remove the http:// from it. I found the urlparse library, but couldn't find any examples of how to do this.

What would be the best way to go about it? If urlparse is overkill, is there a simpler way? Would simply removing the "http://" substring from the URL with the regular string tools be bad practice or cause problems?

Lucifer N. asked Feb 10 '14 20:02





1 Answer

I don't think urlparse offers a single method or function for this. This is how I'd do it:

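# Python 3: urlparse lives in urllib.parse (Python 2: from urlparse import urlparse)
from urllib.parse import urlparse

url = 'HtTp://stackoverflow.com/questions/tagged/python?page=2'

def strip_scheme(url):
    # urlparse lowercases the scheme; geturl() rebuilds the URL with it
    parsed = urlparse(url)
    if not parsed.scheme:
        return url  # no scheme found, nothing to strip
    scheme = "%s://" % parsed.scheme
    # Remove only the first occurrence, i.e. the leading scheme prefix
    return parsed.geturl().replace(scheme, '', 1)

print(strip_scheme(url))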

Output:

stackoverflow.com/questions/tagged/python?page=2
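To see why the replace call matches even though the input is written as HtTp, it can help to look at the pieces the parser returns. This is a small sketch assuming Python 3, where the module is urllib.parse; the attribute names are those of the standard ParseResult.

```python
from urllib.parse import urlparse

parsed = urlparse('HtTp://stackoverflow.com/questions/tagged/python?page=2')

# The parser lowercases the scheme; the other components keep their casing.
print(parsed.scheme)    # http
print(parsed.netloc)    # stackoverflow.com
print(parsed.path)      # /questions/tagged/python
print(parsed.query)     # page=2

# geturl() reassembles the URL from the parsed (normalized) components.
print(parsed.geturl())  # http://stackoverflow.com/questions/tagged/python?page=2
```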

If you used only plain string manipulation, you would have to handle http, https, and possibly other schemes yourself. The parser-based version also copes with odd casing in the scheme, because urlparse lowercases it.
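The difference can be sketched by running a naive prefix strip next to the parser-based function. This assumes Python 3; strip_scheme_naive is a hypothetical helper written only for this comparison, and the guard for scheme-less input in strip_scheme is an added assumption, not part of the answer as posted.

```python
from urllib.parse import urlparse


def strip_scheme_naive(url):
    """Hypothetical naive version: drops a literal lowercase 'http://' only."""
    prefix = "http://"
    return url[len(prefix):] if url.startswith(prefix) else url


def strip_scheme(url):
    """Parser-based version, following the approach in the answer."""
    parsed = urlparse(url)
    if not parsed.scheme:
        return url  # added guard (assumption): nothing to strip
    scheme = "%s://" % parsed.scheme
    return parsed.geturl().replace(scheme, "", 1)


samples = [
    "http://example.com/a",
    "https://example.com/a",
    "HtTp://example.com/a",
]

for u in samples:
    # The naive version only handles the first sample.
    print(u, "->", strip_scheme_naive(u), "|", strip_scheme(u))
```

The naive helper leaves the https and mixed-case inputs untouched, while the parser-based function returns example.com/a for all three.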

Lukas Graf answered Oct 08 '22 18:10
