I'm trying to match the pattern:
<--Header Title-->
some body text
The following only matches the first occurrence:
import re

string1 = """<-- Option 1 -->
Nice text
<--Final stuff-->
Listing all
of
the
text
"""
regex = re.compile(r"<--([\w\s]+)-->([\s\S]*?)(?=\n<--|$)")
m = regex.search(string1)
print(m.groups())
Which results in:
(' Option 1 ', '\nNice text')
However, it seems to work fine using pythex.
What am I doing wrong?
re.search only returns the first match within the string. You want re.finditer or re.findall.
re.search
Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
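re.finditer returns an iterator that yields a match object for every non-overlapping match in the target string, while re.findall returns a list of the matched substrings (or, if the pattern contains capturing groups, a list of the captured groups instead).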
>>> import re
>>> re.findall('a', 'ababababa')
['a', 'a', 'a', 'a', 'a']
>>> x = list(re.finditer('a', 'ababababa'))
>>> x
[<_sre.SRE_Match object; span=(0, 1), match='a'>,
<_sre.SRE_Match object; span=(2, 3), match='a'>,
<_sre.SRE_Match object; span=(4, 5), match='a'>,
<_sre.SRE_Match object; span=(6, 7), match='a'>,
<_sre.SRE_Match object; span=(8, 9), match='a'>]
>>> x[0].group()
'a'
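Applied to the string from the question, a minimal sketch could look like the following. The pattern is unchanged; the names `sections`, `title`, and `body` are mine for illustration, and the `.strip()` calls are an optional addition that trims the whitespace around each captured group.

```python
import re

string1 = """<-- Option 1 -->
Nice text
<--Final stuff-->
Listing all
of
the
text
"""

# Same pattern as in the question.
regex = re.compile(r"<--([\w\s]+)-->([\s\S]*?)(?=\n<--|$)")

# finditer yields one match object per header/body pair instead of
# stopping after the first one, as search does.
sections = [
    (m.group(1).strip(), m.group(2).strip())
    for m in regex.finditer(string1)
]

for title, body in sections:
    print(repr(title), repr(body))

# findall returns the raw groups as tuples because the pattern
# contains two capturing groups.
raw_groups = regex.findall(string1)
```

Because the lookahead `(?=\n<--|$)` does not consume the next header, each search resumes right before it, so the following section is still available to match.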