
Split string on ". ","! " or "? " keeping the punctuation mark [duplicate]

Possible Duplicate:
Python split() without removing the delimiter

I wish to split a string as follows:

text = " T?e  qu!ck ' brown 1 fox!     jumps-.ver. the 'lazy' doG?  !"
result -> (" T?e  qu!ck ' brown 1 fox!", "jumps-.ver.", "the 'lazy' doG?", "!")

So basically I want to split at ". ", "! " or "? ", removing the spaces at the split points but keeping the dot, exclamation mark or question mark.

How can I do this in an efficient way?

The str.split function takes only one separator. I wonder whether the best solution is to split on all spaces and then, when constructing the required result, find the pieces that end with a dot, exclamation mark or question mark.
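For comparison, the split-then-regroup idea described above could be sketched roughly as follows. The function name split_by_regrouping is hypothetical and only for illustration. Note that this sketch does not reproduce the desired result exactly: str.split() with no argument discards the leading space and collapses the repeated spaces inside a piece.

```python
def split_by_regrouping(text):
    """Split on whitespace, then regroup words into pieces that end with . ! or ?"""
    pieces, current = [], []
    for word in text.split():
        current.append(word)
        # A word ending in sentence punctuation closes the current piece.
        if word[-1] in ".!?":
            pieces.append(" ".join(current))
            current = []
    # Keep any trailing words that were not closed by punctuation.
    if current:
        pieces.append(" ".join(current))
    return pieces

text = " T?e  qu!ck ' brown 1 fox!     jumps-.ver. the 'lazy' doG?  !"
print(split_by_regrouping(text))
```

The first piece comes out as "T?e qu!ck ' brown 1 fox!" rather than " T?e  qu!ck ' brown 1 fox!", so the original spacing is lost with this approach.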

asked May 25 '26 14:05 by Baz


1 Answer

You can achieve this using a regular expression split:

>>> import re
>>> text = " T?e  qu!ck ' brown 1 fox! jumps-.ver. the 'lazy' doG?  !"
>>> re.split('(?<=[.!?]) +',text)
[" T?e  qu!ck ' brown 1 fox!", 'jumps-.ver.', "the 'lazy' doG?", '!']

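The regular expression '(?<=[.!?]) +' matches a sequence of one or more spaces (' +'), but only if it is preceded by a ., ! or ? character (the lookbehind '(?<=[.!?])'). Because the lookbehind is zero-width, the punctuation mark is not consumed by the match and stays attached to the preceding piece.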
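As a minimal sketch, the same pattern can be wrapped in a reusable function. The name split_keep_punct is hypothetical, and using \s+ instead of ' +' is an assumption made here so that tabs and newlines also count as split points; the question itself only asks about spaces.

```python
import re

# Zero-width lookbehind: the whitespace run must directly follow '.', '!' or '?'.
# Assumption: \s+ (any whitespace) instead of ' +' (spaces only).
SENTENCE_BREAK = re.compile(r'(?<=[.!?])\s+')

def split_keep_punct(text):
    """Split after '.', '!' or '?' when followed by whitespace, keeping the mark."""
    return SENTENCE_BREAK.split(text)

text = " T?e  qu!ck ' brown 1 fox!     jumps-.ver. the 'lazy' doG?  !"
parts = split_keep_punct(text)
print(parts)
```

One thing to watch for: if the input ends with a punctuation mark followed by whitespace (for example "Hi. "), re.split returns an empty string as the last element, so it may be worth calling rstrip() on the text first.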

answered May 27 '26 02:05 by isedev


