Okay, so I found this: How to find all occurrences of a substring?
Which says that, to get the indices of overlapping occurrences of a substring as a list, you can use:
[m.start() for m in re.finditer('(?=SUBSTRING)', 'STRING')]
Which works, but my problem is that both the string and the substring to look for are held in variables, and I don't know enough about regular expressions to build the pattern from them. I can get it to work for non-overlapping substrings; that's just:
[m.start() for m in re.finditer(p3, p1)]
Edit:
Because someone asked, I'll go ahead and specify. p1 and p3 could be any strings, but if they were, for example, p3 = "tryt" and p1 = "trytryt", the result should be [0, 3].
The arguments to re.finditer are simple strings. If you have the substring in a variable, simply format it into the regular expression. Something like '(?={0})'.format(p3) is a start. Since various symbols have special meaning in a regular expression, you will want to escape them. Luckily, the re module includes re.escape for just such a need.
[m.start() for m in re.finditer('(?={0})'.format(re.escape(p3)), p1)]
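To see why the escaping matters, here is a minimal sketch that wraps the one-liner above in a function. The name overlapping_indices and its parameter names are made up for illustration; only re.finditer and re.escape come from the answer itself.

```python
import re

def overlapping_indices(needle, haystack):
    # Hypothetical helper, not part of the re module.
    # The lookahead (?=...) matches without consuming characters,
    # so a match may begin inside the previous occurrence.
    # re.escape makes characters such as '.' match literally.
    pattern = '(?={0})'.format(re.escape(needle))
    return [m.start() for m in re.finditer(pattern, haystack)]

print(overlapping_indices('tryt', 'trytryt'))  # [0, 3]
print(overlapping_indices('a.a', 'a.a.a'))     # [0, 2]
print(overlapping_indices('a.a', 'aXa'))       # [] because '.' is literal
```

Without re.escape, the pattern built from 'a.a' would also match 'aXa', since an unescaped '.' matches any character.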
Regex might be overkill here:
>>> word = 'tryt'
>>> text = 'trytryt'
>>> [i for i, _ in enumerate(text) if text.startswith(word, i)]
[0, 3]
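If checking every index feels wasteful, a variant of the same idea is to let str.find jump straight to the next occurrence. This is only a sketch, and find_overlapping is a made-up name, not something from the answers above.

```python
def find_overlapping(word, text):
    # Hypothetical helper: collect the start index of every occurrence,
    # including overlapping ones, using plain string methods.
    indices = []
    i = text.find(word)
    while i != -1:
        indices.append(i)
        # Resume one character after the last hit (not after its end),
        # so overlapping occurrences are still found.
        i = text.find(word, i + 1)
    return indices

print(find_overlapping('tryt', 'trytryt'))  # [0, 3]
```

Because no regular expression is involved, nothing in word needs escaping.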