Let's say that my program receives an input string that can contain any type of character, for example 'Bob's Bagel Shop'. Then it gets another string, 'Fred's Bagel Store'. How can I use regular expressions or some other module in Python to compare these and have my program tell me if at least 5 (or any number I want) of the characters are the same anywhere in the strings, all in the same order, such as the word 'Bagel'?
Thanks.
There's a Python standard library class, difflib.SequenceMatcher, that will help to solve your problem. Here's a code sample:
from difflib import SequenceMatcher
s1 = "Bob's Bagel Shop"
s2 = "Bill's Bagel Shop"
matcher = SequenceMatcher(a=s1, b=s2)
match = matcher.find_longest_match(0, len(s1), 0, len(s2))
Result:
Match(a=3, b=4, size=13) # value that 'match' variable holds
The result shows that both strings share a common substring 13 characters long, starting at index 3 in the first string and index 4 in the second (indices are zero-based).
You can use this match result object to get its fields as values:
match.size # 13
match.a # 3
match.b # 4
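To apply this to the original question (at least 5 matching characters in a row), one possible sketch is to wrap the call in a small helper. The names shares_run and min_len are made up for illustration and are not part of difflib; autojunk=False is passed only to keep the matcher from skipping frequent characters on long inputs.

```python
from difflib import SequenceMatcher


def shares_run(s1, s2, min_len=5):
    """Return the longest common contiguous substring of s1 and s2
    if it has at least min_len characters, otherwise None.

    Hypothetical helper for illustration; not part of difflib.
    """
    # autojunk=False turns off the "popular element" heuristic,
    # which would otherwise apply to sequences of 200+ items.
    matcher = SequenceMatcher(None, s1, s2, autojunk=False)
    m = matcher.find_longest_match(0, len(s1), 0, len(s2))
    if m.size >= min_len:
        return s1[m.a:m.a + m.size]
    return None


common = shares_run("Bob's Bagel Shop", "Fred's Bagel Store")
print(common)
```

The helper returns the shared text rather than just True/False, so the caller can see what matched; use `shares_run(...) is not None` if only a yes/no answer is needed.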
You can use itertools.combinations and then use the intersection of sets to find the matching characters from both strings (whitespace is removed before comparing):
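from itertools import combinations

str1 = "Bob's Bagel Shop"
str2 = "Fred's Bagel Store"

def combi(strs):
    # Drop all whitespace so only the visible characters are compared.
    chars = ''.join(strs.split())
    lis = []
    for x in range(1, len(chars)):
        for y in combinations(chars, x):
            # Keep only combinations that occur as a contiguous run.
            if ''.join(y) in chars:
                lis.append(''.join(y))
    return lis

lis1 = combi(str1)
lis2 = combi(str2)
# Longest string present in both lists.
print(max(set(lis1).intersection(set(lis2)), key=len))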
output:
'sBagelS
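Note that combinations produces every subset of characters, so the work grows exponentially with string length. A lighter variant of the same set-intersection idea, sketched here as an assumption rather than as the answer's own method, builds only the contiguous slices. The names substrings and longest_common are hypothetical, and unlike the code above this version also considers the full string as a candidate.

```python
def substrings(text):
    """Return the set of all contiguous substrings of text,
    after removing whitespace (same preprocessing as combi)."""
    chars = ''.join(text.split())
    return {
        chars[i:j]
        for i in range(len(chars))
        for j in range(i + 1, len(chars) + 1)
    }


def longest_common(s1, s2):
    """Return the longest substring shared by s1 and s2, or '' if none.
    If several share the maximum length, which one is returned is arbitrary."""
    common = substrings(s1) & substrings(s2)
    return max(common, key=len) if common else ''


result = longest_common("Bob's Bagel Shop", "Fred's Bagel Store")
print(result)
```

Checking `len(result) >= 5` then answers the "at least 5 characters" part of the question.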