String Matching with wildcard in Python

I'm trying to find the position of a substring within a larger string that contains wildcards (here, . stands for any single character). I'd like something like the following to print 5:

substring = 'ABCDEF'
large_string = 'QQQQQABC.EFQQQQQ'

start = string.find(substring, large_string)
print(start)

5

Thank you in advance.

xygonyx asked Sep 14 '19 14:09



1 Answer

The idea is to convert what you are looking for, ABCDEF in this case, into the following regular expression:

([A]|\.)([B]|\.)([C]|\.)([D]|\.)([E]|\.)([F]|\.)

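Each group matches either the search character or a literal ., which is the wildcard in the large string. Each character is placed in [] so that it is taken literally even if it is a regex special character. The main complication is a ^ among the search characters, as in ABCDEF^: [^] is not a valid character class, so ^ is handled as a special case and escaped as \^ instead. A backslash among the search characters would break [] in the same way and is not handled by the code below.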
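As a minimal sketch of the same construction, the pattern could also be built with re.escape, which escapes any regex special character (including ^ and backslash) without a special case. This is an alternative to the answer's [] approach, not the answer's own code; the helper name build_pattern is hypothetical, and it uses non-capturing groups (?:...) since the groups are never read back.

```python
import re

def build_pattern(substring):
    # Hypothetical helper: for each search character, accept either that
    # character (escaped so it is taken literally) or a literal '.'.
    return ''.join('(?:{}|\\.)'.format(re.escape(ch)) for ch in substring)

m = re.search(build_pattern('ABCDEF'), 'QQQQQABC.EFQQQQQ')
if m:
    print(m.span())  # (5, 11)
```

Because every character goes through re.escape, a search string such as ABCDEF^ needs no separate substitution step.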

Then you search the string for that pattern using re.search:

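import re

substring = 'ABCDEF'
large_string = 'QQQQQABC.EF^QQQQQ'

# Wrap every character except ^ as ([c]|\.): the character itself or a literal '.'
new_substring = re.sub(r'([^^])', r'([\1]|\\.)', substring)
# ^ cannot sit alone inside [], so escape it instead: (\^|\.)
new_substring = re.sub(r'\^', r'(\\^|\\.)', new_substring)
print(new_substring)
regex = re.compile(new_substring)
m = regex.search(large_string)
if m:
    # span() returns the (start, end) indices of the match
    print(m.span())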

Prints:

([A]|\.)([B]|\.)([C]|\.)([D]|\.)([E]|\.)([F]|\.)
(5, 11)
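For comparison, here is a sketch that gets the start index the question asks for without any regex, by sliding a window over the large string and treating . in it as matching any search character. This is a swapped-in technique, not the approach above; the function name find_with_wildcard and the wildcard parameter are assumptions made for illustration, and it returns -1 when nothing matches, mirroring str.find.

```python
def find_with_wildcard(substring, large_string, wildcard='.'):
    # Hypothetical helper: compare substring against every window of the
    # same length; a wildcard in large_string matches any character.
    n = len(substring)
    for start in range(len(large_string) - n + 1):
        window = large_string[start:start + n]
        if all(h == wildcard or h == s for h, s in zip(window, substring)):
            return start
    return -1  # no match, same convention as str.find

print(find_with_wildcard('ABCDEF', 'QQQQQABC.EFQQQQQ'))  # 5
```

Since this version never builds a pattern, there is nothing to escape, at the cost of a plain character-by-character scan.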
Booboo answered Oct 22 '22 18:10
