I'm trying to find the location of a substring within a string that contains wildcards. For example:
substring = 'ABCDEF'
large_string = 'QQQQQABC.EFQQQQQ'
start = string.find(substring, large_string)
print(start)
5
Thank you in advance.
The idea is to convert what you are looking for, ABCDEF in this case, into the following regular expression:
([A]|\.)([B]|\.)([C]|\.)([D]|\.)([E]|\.)([F]|\.)
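Each character is placed in [] in case it turns out to be a regex special character. One complication is if one of the search characters is ^, as in ABCDEF^: inside [] a leading ^ negates the character class, so ^ is escaped instead and is therefore handled specially. A backslash in the search string would break the bracket trick too and is not handled by the code below.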
Then you search the string for that pattern using re.search:
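import re

substring = 'ABCDEF'
large_string = 'QQQQQABC.EF^QQQQQ'

# Wrap every character except ^ as ([c]|\.) so it matches itself or a literal dot
new_substring = re.sub(r'([^^])', r'([\1]|\\.)', substring)
# ^ cannot safely go inside [], so escape it instead: (\^|\.)
new_substring = re.sub(r'\^', r'(\\^|\\.)', new_substring)
print(new_substring)

regex = re.compile(new_substring)
m = regex.search(large_string)
if m:
    # span() returns the (start, end) indices of the match
    print(m.span())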
Prints:
([A]|\.)([B]|\.)([C]|\.)([D]|\.)([E]|\.)([F]|\.)
(5, 11)
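If only the start index is needed, as in the question, a regex-free alternative is a plain sliding-window comparison. This is a different technique from the answer above, shown only as a sketch: the function name find_with_wildcards is made up, it assumes '.' in the large string is the only wildcard, and it returns -1 when nothing matches, mirroring str.find.

```python
def find_with_wildcards(substring, large_string, wildcard="."):
    # Hypothetical helper: slide a window of len(substring) over large_string
    # and accept a position when every window character is either the
    # wildcard or equal to the corresponding search character.
    n = len(substring)
    for start in range(len(large_string) - n + 1):
        window = large_string[start:start + n]
        if all(c == wildcard or c == s for c, s in zip(window, substring)):
            return start
    return -1  # same "not found" convention as str.find

print(find_with_wildcards("ABCDEF", "QQQQQABC.EFQQQQQ"))  # 5
```

This runs in O(len(large_string) * len(substring)) time, which should be fine for short search strings; the regex approach is likely the better fit when the pattern is reused many times.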