
How to regex split, but keep the split string?

Tags: python, regex

I have the following URL pattern:

http://www.hulu.jp/watch/589851/supernatural-dub-hollywood-babylon/en

I would like to get everything up to and including /watch/\d+/.

So far I have:

>>> import re
>>> re.split(r'watch/\d+/', 'http://www.hulu.jp/watch/589851/supernatural-dub-hollywood-babylon/en')
['http://www.hulu.jp/', 'supernatural-dub-hollywood-babylon/en']

But this does not include the split string (the part that appears between the domain and the rest of the path). The result I want is:

http://www.hulu.jp/watch/589851
David542 asked Jan 09 '23 06:01



1 Answer

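You need to use a capture group. When the pattern contains capturing parentheses, re.split also returns the text matched by the group as part of the resulting list: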

>>> re.split(r'(watch/\d+/)', 'http://www.hulu.jp/watch/589851/supernatural-dub-hollywood-babylon/en')
['http://www.hulu.jp/', 'watch/589851/', 'supernatural-dub-hollywood-babylon/en']
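
To get from that list to the single string asked for in the question, the first two pieces can be joined back together. The following is only a minimal sketch: the helper names prefix_via_split and prefix_via_search are made up for illustration, and stripping the trailing slash is an assumption based on the desired output shown in the question. The second helper shows an alternative that skips splitting and matches the prefix directly.

```python
import re

URL = 'http://www.hulu.jp/watch/589851/supernatural-dub-hollywood-babylon/en'


def prefix_via_split(url):
    """Hypothetical helper: rebuild the prefix from the re.split pieces."""
    # The capture group keeps the delimiter in the result;
    # maxsplit=1 stops after the first match.
    parts = re.split(r'(watch/\d+/)', url, maxsplit=1)
    if len(parts) < 3:
        # The pattern did not match, so there is nothing to rebuild.
        return None
    head, delimiter, _rest = parts
    # Assumption: the trailing slash is unwanted, as in the question's example.
    return (head + delimiter).rstrip('/')


def prefix_via_search(url):
    """Hypothetical helper: match the prefix directly instead of splitting."""
    # The lazy .*? stops at the first /watch/<digits>; the lookahead requires
    # the following slash without including it in the match.
    match = re.match(r'.*?/watch/\d+(?=/)', url)
    return match.group(0) if match else None


print(prefix_via_split(URL))   # http://www.hulu.jp/watch/589851
print(prefix_via_search(URL))  # http://www.hulu.jp/watch/589851
```

Both helpers return None when the URL contains no /watch/<digits>/ segment, so the caller can tell a non-matching URL apart from a real prefix.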
Mazdak answered Jan 14 '23 15:01
