
Python - Regular Expression For Domain Names

I am trying to use the following regular expression to extract a domain name from some text, but it produces nothing. What's wrong with it?

I'm not sure whether a "fix my code" question like this is appropriate here; maybe I should read more.

I just want to save some time.

Thanks.

pat_url = re.compile(r'''
            (?:https?://)*
            (?:[\w]+[\-\w]+[.])*
            (?P<domain>[\w\-]*[\w.](com|net)([.](cn|jp|us))*[/]*)
            ''')

print re.findall(pat_url,"http://www.google.com/abcde")

I want the output to be google.com.

yasein asked Jun 23 '26 03:06


1 Answer

Don't use regex for this. Use the urlparse module from the standard library instead (it is urllib.parse in Python 3). It's far more straightforward and easier to read and maintain.

http://docs.python.org/library/urlparse.html
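
As a rough sketch of this suggestion, something like the following might work. It uses Python 3's urllib.parse, the successor of the urlparse module. The helper name extract_domain and the rule of stripping only a leading "www." are assumptions made for illustration, not part of the standard library; reducing an arbitrary host name to its registered domain in general would need a public suffix list, which this does not attempt.

```python
from urllib.parse import urlparse  # Python 2 equivalent: from urlparse import urlparse


def extract_domain(url):
    """Return the host part of url without a leading "www." (hypothetical helper)."""
    # urlparse only recognises the host when the URL has a scheme or starts
    # with "//", so prepend "//" for bare inputs like "www.google.com/abcde".
    if "//" not in url:
        url = "//" + url
    host = urlparse(url).hostname or ""
    # Naive assumption: only "www." is stripped; other subdomains are kept.
    if host.startswith("www."):
        host = host[len("www."):]
    return host


print(extract_domain("http://www.google.com/abcde"))  # google.com
```

The parsing itself is left to the standard library, so the scheme, port, path, and query string never need to be handled by hand.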

Amber answered Jun 24 '26 17:06


