I am working with an application, written with Flask, that returns URLs. I want the URL displayed to the user to be as clean as possible, so I want to remove the http:// from it. I looked and found the urlparse library, but couldn't find any examples of how to do this.

What would be the best way to go about it, and if urlparse is overkill, is there a simpler way? Would simply removing the "http://" substring from the URL using the regular string parsing tools be bad practice or cause problems?
I don't think urlparse offers a single method or function for this. This is how I'd do it:
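from urllib.parse import urlparse  # Python 2: from urlparse import urlparse

url = 'HtTp://stackoverflow.com/questions/tagged/python?page=2'

def strip_scheme(url):
    parsed = urlparse(url)
    if not parsed.scheme:
        # No scheme found: return the input unchanged, otherwise a bare
        # "://" later in the URL (e.g. in the query string) would be removed.
        return url
    # parsed.scheme is lowercased, and geturl() rebuilds the URL with it
    scheme = "%s://" % parsed.scheme
    return parsed.geturl().replace(scheme, '', 1)

print(strip_scheme(url))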
Output:
stackoverflow.com/questions/tagged/python?page=2
If you used only simple string parsing, you'd have to deal with http[s], and possibly other schemes, yourself. Also, this handles weird casing of the scheme.
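To make that difference concrete, here is a small comparison sketch. `naive_strip` is a hypothetical helper written only for this illustration, and the `example.com` URLs are made up; `strip_scheme` follows the approach from the answer, with a guard for scheme-less input added as an assumption about what you'd want in that case.

```python
from urllib.parse import urlparse  # Python 2: from urlparse import urlparse


def strip_scheme(url):
    """Drop a leading 'scheme://' using urlparse (approach from the answer)."""
    parsed = urlparse(url)
    if not parsed.scheme:
        return url  # assumption: leave scheme-less input untouched
    scheme = "%s://" % parsed.scheme
    return parsed.geturl().replace(scheme, "", 1)


def naive_strip(url):
    """Hypothetical naive version: removes only a literal 'http://'."""
    return url.replace("http://", "", 1)


# Made-up sample URLs covering other schemes and odd casing.
samples = [
    "http://example.com/a",
    "https://example.com/a",
    "HtTp://example.com/a",
    "ftp://example.com/a",
]

for sample in samples:
    print(sample, "->", naive_strip(sample), "|", strip_scheme(sample))
```

The naive version only changes the first sample; the other three come back untouched because they do not contain the exact substring "http://". If http and https are the only schemes your application ever produces, a plain string check may still be enough, but you would have to handle casing yourself.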