I'm trying to get only the "Text3" part with the following code:
import re
stringtotest = "begin:Text1<wrong>Text2<wrong>Text3<right>Text4<wrong>"
right = re.findall("<wrong>(.+?)<right>",stringtotest)
>>> right
['Text2<wrong>Text3']
Why does Python give me Text2 as well? How do I tell it that I want only the part after the nearest "wrong"? Thank you.
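The dot . matches any character (except a newline by default), including "<". The regex engine takes the leftmost match, so it starts at the first <wrong>, and the lazy .+? then expands across the second <wrong> until it reaches <right>. You can use a negated character class to restrict the match: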
<wrong>([^<]+?)<right>
If you want the match itself, not just the capture group, to exclude the outer tags, use a lookbehind and a lookahead to assert the tags without consuming them:
(?<=<wrong>)([^<]+?)(?=<right>)
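As a quick check, here is a minimal sketch that runs both patterns above against the string from the question. The variable names with_group and with_lookaround are just illustrative.

```python
import re

stringtotest = "begin:Text1<wrong>Text2<wrong>Text3<right>Text4<wrong>"

# Negated character class: the captured text may not contain "<",
# so the match cannot run across the second <wrong> tag.
with_group = re.findall(r"<wrong>([^<]+?)<right>", stringtotest)

# Lookbehind/lookahead: the tags are asserted but not consumed,
# so the overall match is only the text between them.
with_lookaround = re.findall(r"(?<=<wrong>)([^<]+?)(?=<right>)", stringtotest)

print(with_group)       # ['Text3']
print(with_lookaround)  # ['Text3']
```

Note that [^<] assumes the text you want never contains a "<" character.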
You can also use a quantifier guarded by a negative lookahead, which lets the captured text contain anything except another <wrong>:
<wrong>((?:(?!<wrong>).)*)<right>
See demo:
https://regex101.com/r/8yUhDL/1
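A small sketch of how this behaves in Python, under the assumption that the wanted text might itself contain a stray "<". The tricky string below is a made-up input, not from the question.

```python
import re

# Each "." is consumed only if <wrong> does not start at that position.
pattern = re.compile(r"<wrong>((?:(?!<wrong>).)*)<right>")

stringtotest = "begin:Text1<wrong>Text2<wrong>Text3<right>Text4<wrong>"
tempered = pattern.findall(stringtotest)

# Hypothetical input where the wanted text contains "<".
tricky = "<wrong>x<wrong>a<b<right>"
tempered_tricky = pattern.findall(tricky)
class_tricky = re.findall(r"<wrong>([^<]+?)<right>", tricky)

print(tempered)         # ['Text3']
print(tempered_tricky)  # ['a<b']
print(class_tricky)     # []
```

The lookahead version is slower than the negated character class because it tests the lookahead at every character, so it is probably only worth using when "<" can appear in the text. If the text can span several lines, pass re.DOTALL so that the dot also matches newlines.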