
Python regex replace sentence with starting word

This is probably one of those simple things that I am missing, but I have not been able to find a solution that would solve my issue.

I have two strings that are in the following format:

s1 = '87, 72 Start I am a sentence finish'
s2 = '93, 83 Start I am a sentence end'

Following the answer to "Replace all text between 2 strings python", I am able to remove a phrase when given both a start word and an end word, as follows.

import re
s1 = '87, 72 Start I am a sentence finish'
s2 = '93, 83 Start I am a sentence end'

print(re.sub("Start.*?finish", '', s1, flags=re.DOTALL).strip())
print(re.sub("Start.*?end", '', s2, flags=re.DOTALL).strip())

>>> 87, 72
>>> 93, 83

In my case, I will have conditions where the starting word is the same, but the ending word could be different.

Is it possible to replace the desired phrase by providing only the starting word?

I have tried this, but it only replaces the starting word.

s1 = '87, 72 Start I am a sentence finish'
print(re.sub("Start.*?", '', s1, flags=re.DOTALL).strip())

>>> 87, 72 I am a sentence finish
Wondercricket asked May 08 '15 17:05


3 Answers

Use an end of line anchor $ and greedy matching .*:

print(re.sub("Start.*$", '', s1, flags=re.DOTALL).strip())


Sample code:

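import re
# Greedy .* followed by the $ anchor matches from "Start" to the end of the string
p = re.compile(r'Start.*$')
test_str = "87, 72 Start I am a sentence finish"
# Remove the matched phrase, then trim the leftover trailing space
result = re.sub(p, "", test_str).strip()
print(result)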

Output:

87, 72
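
A side note on the flag, as a minimal sketch: in re.sub the fourth positional parameter is count, not flags, so re.DOTALL is safest passed as flags=re.DOTALL. The multi-line input below is a made-up variation of the question's string (it is not from the original post) and is only there to show when the flag actually changes the result.

```python
import re

# Hypothetical variation of the question's string, with a line break inside the phrase.
text = "87, 72 Start I am\na sentence finish"

# Without DOTALL, "." stops at the newline and "$" cannot match there,
# so the pattern does not match and nothing is removed.
without_flag = re.sub(r"Start.*$", "", text).strip()

# With DOTALL passed by keyword, ".*" runs across the newline to the end of the string.
with_flag = re.sub(r"Start.*$", "", text, flags=re.DOTALL).strip()

print(repr(without_flag))
print(repr(with_flag))
```

For the single-line strings in the question, the flag makes no difference either way.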
Wiktor Stribiżew answered Nov 12 '22 21:11


You can use "$" to match the "end of line", so "Start.*$" should do it.
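
A quick sketch of that suggestion applied to both strings from the question; the variable name results is just for illustration:

```python
import re

s1 = '87, 72 Start I am a sentence finish'
s2 = '93, 83 Start I am a sentence end'

# The same pattern works for both strings because it never names the ending word.
results = [re.sub(r"Start.*$", "", s).strip() for s in (s1, s2)]
print(results)
```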

Buddy answered Nov 12 '22 22:11


Alternatively, you can just remove the ? (which makes the quantifier non-greedy) from your regex. A greedy .* matches up to the end of the line by default, so there is no need to use $ here.

print(re.sub("Start.*", '', s1, flags=re.DOTALL).strip())


Input:

'87, 72 Start I am a sentence finish'

Output:

>>> 87, 72
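
To make the difference concrete, here is a rough sketch comparing the lazy and greedy forms side by side. The helper remove_from is a hypothetical name introduced only for this example; it assumes the starting word should be treated as literal text, hence the re.escape call.

```python
import re

s1 = '87, 72 Start I am a sentence finish'

# Lazy: ".*?" is satisfied with zero characters, so only "Start" itself is removed.
lazy = re.sub(r"Start.*?", "", s1)

# Greedy: ".*" consumes everything up to the end of the line.
greedy = re.sub(r"Start.*", "", s1)

print(repr(lazy))
print(repr(greedy))


def remove_from(start_word, text):
    """Hypothetical helper: drop everything from start_word onward."""
    # re.escape keeps any regex metacharacters in start_word literal.
    pattern = re.escape(start_word) + r".*"
    return re.sub(pattern, "", text, flags=re.DOTALL).strip()


print(remove_from("Start", '93, 83 Start I am a sentence end'))
```

With the helper, only the starting word has to be supplied, whatever the ending word turns out to be.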
karthik manchala answered Nov 12 '22 21:11