How to regex split, but keep the split string?

Question

I have the following URL pattern:

http://www.hulu.jp/watch/589851/supernatural-dub-hollywood-babylon/en

I would like to get everything up until and inclusive of /watch/\d+/.

So far I have:

>>> re.split(r'watch/\d+/', 'http://www.hulu.jp/watch/589851/supernatural-dub-hollywood-babylon/en')
['http://www.hulu.jp/', 'supernatural-dub-hollywood-babylon/en']

But this does not include the split string (the part that appears between the domain and the rest of the path). The result I want is:

http://www.hulu.jp/watch/589851

Mazdak · Accepted Answer

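You need to use a capture group. When the pattern passed to re.split contains capturing parentheses, the text matched by each group is also returned in the resulting list: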

>>> re.split(r'(watch/\d+/)', 'http://www.hulu.jp/watch/589851/supernatural-dub-hollywood-babylon/en')
['http://www.hulu.jp/', 'watch/589851/', 'supernatural-dub-hollywood-babylon/en']
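To get from that list to the single string asked for in the question, the pieces still have to be joined. The following is a minimal sketch, not part of the original answer. The variable names are illustrative, and the second option (matching the prefix directly with re.match instead of splitting) is an alternative technique, shown on the assumption that only the leading part of the URL is needed.

```python
import re

url = 'http://www.hulu.jp/watch/589851/supernatural-dub-hollywood-babylon/en'

# Option 1: split once on the captured delimiter, then join the text
# before the delimiter with the delimiter itself.
parts = re.split(r'(watch/\d+/)', url, maxsplit=1)
prefix_from_split = ''.join(parts[:2])

# Drop the trailing slash to match the form shown in the question.
prefix_no_slash = prefix_from_split.rstrip('/')

# Option 2 (alternative, no split): lazily match everything up to and
# including /watch/<digits>.
match = re.match(r'.*?/watch/\d+', url)
prefix_from_match = match.group(0) if match else None

print(prefix_no_slash)
print(prefix_from_match)
```

Option 1 keeps the trailing slash that is part of the split pattern until it is stripped explicitly. Option 2 stops right after the digits, so no cleanup is needed, and it yields None when the URL has no /watch/<digits> segment.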

Tags:

python

regex

David542
