I found several similar questions, but none of them quite fits my problem. I am trying to find and replace a string that sits between two other strings in a text.
reg = "%s(.*?)%s" % (str1,str2)
r = re.compile(reg,re.DOTALL)
result = r.sub(newstring, originaltext)
The problem is that the code above also replaces str1 and str2, whereas I want to replace only the text between them. Am I missing something obvious?
Update:
Here is a simplified example:
import re

text = 'abcdefghijklmnopqrstuvwxyz'
str1 = 'gh'
str2 = 'op'
newstring = 'stackexchange'
reg = "%s(.*?)%s" % (str1,str2)
r = re.compile(reg,re.DOTALL)
result = r.sub(newstring, text)
print(result)
The result is abcdefstackexchangeqrstuvwxyz, whereas I need abcdefghstackexchangeopqrstuvwxyz.
Use a combination of lookarounds in your regular expression.
reg = "(?<=%s).*?(?=%s)" % (str1,str2)
Explanation:
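Lookarounds are zero-width assertions. They check what precedes or follows the current position without consuming any characters of the string, so str1 and str2 are not part of the match and are left untouched by the substitution.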
(?<=    # look behind to see if there is:
  gh    #   'gh'
)       # end of look-behind
.*?     # any character except \n, or any character at all with re.DOTALL (0 or more times, as few as possible)
(?=     # look ahead to see if there is:
  op    #   'op'
)       # end of look-ahead
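Putting the pieces together, here is a minimal sketch of the lookaround approach applied to the simplified example from the question. The helper name `replace_between` is hypothetical, and the `re.escape` calls are an added assumption (not part of the original answer) so that delimiters containing regex metacharacters are treated literally. Note that Python's `re` module requires look-behind patterns to be fixed-width, which holds for literal delimiters like these.

```python
import re

def replace_between(text, start, end, replacement):
    """Replace the text between `start` and `end`, keeping both delimiters.

    Hypothetical helper for illustration; `re.escape` is an added safeguard
    so that delimiters are matched literally.
    """
    pattern = re.compile(
        "(?<=%s).*?(?=%s)" % (re.escape(start), re.escape(end)),
        re.DOTALL,  # let '.' match newlines too, as in the question
    )
    return pattern.sub(replacement, text)

text = 'abcdefghijklmnopqrstuvwxyz'
result = replace_between(text, 'gh', 'op', 'stackexchange')
print(result)  # abcdefghstackexchangeopqrstuvwxyz
```

Because the look-behind and look-ahead only assert that 'gh' and 'op' are present, the match itself covers just 'ijklmn', and that is the only part `sub` replaces.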