
Split a text by specific word or phrase and keep the word in Python

Is there an elegant way to split a text by a word and keep the word as well? There are some workarounds using split from the re package with a pattern (for example, "Python RE library String Split but keep the delimiters/separators as part of the next string"), but none of them works for this scenario, where the delimiter is repeated multiple times. For example:

 s = "I want to split text here, and also keep here, and return all as list items"

Using partition:

 s.partition("here")
>> ('I want to split text ', 'here', ', and also keep here, and return all as list items')

Using re.split():

re.split("here",s)
>> ['I want to split text ', ', and also keep ', ', and return all as list items']

The desired output should be something like the following list:

['I want to split text', 'here', ' , and also keep ', 'here', ' , and return all as list items']

Sam S. asked Oct 17 '25 02:10



2 Answers

Yes. What you're looking for is a feature of the re.split() function. If you use a capture group in the pattern, the matched delimiter text is returned as well:

import re

s = "I want to split text here, and also keep here, and return all as list items"

r = re.split('(here)', s)

print(r)

Result:

['I want to split text ', 'here', ', and also keep ', 'here', ', and return all as list items']
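If the word or phrase may contain characters that are special in regular expressions (parentheses, dots, and so on), it probably needs escaping before it goes into the pattern. A minimal sketch, assuming a hypothetical helper named split_keep (not part of the re module) and an invented sample string:

```python
import re

def split_keep(text, phrase):
    # Hypothetical helper: escape the phrase so it is matched literally,
    # then wrap it in a capture group so re.split() keeps it in the result.
    pattern = "(" + re.escape(phrase) + ")"
    return re.split(pattern, text)

parts = split_keep("Cost (USD) is 5, price (USD) is 7", "(USD)")
print(parts)
```

Without re.escape(), the parentheses in "(USD)" would be read as a group rather than as literal characters.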

If you define multiple capture groups, re.split() returns the text matched by each of them as a separate item. So you can keep just a part of the delimiter, or several parts that are each returned individually. I've done some fairly crazy things with this feature in the past. It can replace an appreciable amount of code that would otherwise be necessary.
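As a rough illustration of both cases, the sketch below first keeps only part of the delimiter (the word, but not the surrounding whitespace and comma), then uses two groups that each contribute their own item. The patterns and the second sample string are made up for this example:

```python
import re

s = "I want to split text here, and also keep here, and return all as list items"

# Only "here" is inside the capture group, so the surrounding
# whitespace and the comma are consumed by the split and discarded.
trimmed = re.split(r"\s*(here),\s*", s)
print(trimmed)

# Two capture groups: the number and the unit come back as separate items.
sizes = re.split(r"(\d+)(px|em)", "width 10px height 2em end")
print(sizes)
```

The first call comes close to the output requested in the question, with the padding around each piece removed.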

CryptoFool answered Oct 19 '25 18:10



Using re is no doubt the best way, but you could also apply str.partition() recursively:

def partitions(whole_string, split_string):
    # str.partition() returns (before, separator, after); separator is '' when not found
    before, sep, after = whole_string.partition(split_string)
    if not sep:
        return [whole_string]
    return [before, sep, *partitions(after, split_string)]
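One caveat with the recursive form: each occurrence of the delimiter adds a stack frame, so a text with many occurrences (on the order of a thousand under CPython's default recursion limit) may raise RecursionError. A loop-based sketch of the same idea, with partitions_iter as a hypothetical name:

```python
def partitions_iter(whole_string, split_string):
    # Same result as the recursive partition approach, but built with a loop,
    # so the number of delimiter occurrences is not bounded by the call stack.
    parts = []
    rest = whole_string
    while True:
        before, sep, after = rest.partition(split_string)
        if not sep:
            # No further occurrence: whatever remains is the last piece.
            parts.append(rest)
            return parts
        parts.extend([before, sep])
        rest = after

s = "I want to split text here, and also keep here, and return all as list items"
result = partitions_iter(s, "here")
print(result)
```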

bn_ln answered Oct 19 '25 20:10



