
Trying to find a match in two strings - Python

A user inputs two strings, and I want to check whether they share any characters and, if they do, get the position where the first match occurs, without using the find or index functions.

Below is what I have so far, but it doesn't fully work. I'm able to find the matching characters, but I'm not sure how to find their positions without using the index function.

string_a = "python"

string_b = "honbe"

same = []

a_len = len(string_a)
b_len = len(string_b)

for a in string_a:
    for b in string_b:

        if a == b:
            same.append(b)          

print (same)

Right now the output is:

['h', 'o', 'n']

So basically what I am asking is: how can I find the positions of those characters without using Python's index function?

l00kitsjake asked Oct 31 '13 21:10


2 Answers

This is a perfect use case for difflib.SequenceMatcher:

import difflib

string_a = 'python'
string_b = 'honbe'

matcher = difflib.SequenceMatcher(a=string_a, b=string_b)
match = matcher.find_longest_match(0, len(matcher.a), 0, len(matcher.b))

The match object will have the attributes a, b, and size, where a is the starting index from the string matcher.a, b is the starting index from matcher.b, and size is the length of the match.

For example:

>>> match
Match(a=3, b=0, size=3)
>>> matcher.a[match.a:match.a+match.size]
'hon'
>>> match.a
3
>>> match.b
0
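
If difflib is more than is needed, the position of the first shared character can also be found with enumerate alone, which avoids find and index as the question requires. The following is only a minimal sketch: first_common_position is a hypothetical helper name, and it assumes "first" means the earliest character of string_a that also occurs anywhere in string_b.

```python
def first_common_position(string_a, string_b):
    """Return (index_in_a, index_in_b, char) for the first character of
    string_a that also occurs in string_b, or None if nothing is shared."""
    for i, a in enumerate(string_a):
        for j, b in enumerate(string_b):
            if a == b:
                # Stop at the first hit; enumerate supplies both positions.
                return i, j, a
    return None


print(first_common_position("python", "honbe"))  # (3, 0, 'h')
```

Appending (i, j, a) to a list instead of returning would collect every shared position rather than only the first.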
Andrew Clark answered Oct 04 '22 01:10


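You can solve this problem with a list comprehension over zip. (In Python 2, itertools.izip is the lazy equivalent; in Python 3 the built-in zip is already lazy and itertools.izip no longer exists.)

string_a = 'hello_world'
string_b = 'hi_low_old'

# Collect every index at which both strings hold the same character.
same = [i for i, x in enumerate(zip(string_a, string_b)) if all(y == x[0] for y in x)]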

In [38]: same
Out[38]: [0, 3, 4, 7]

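Here we compare the two strings position by position and return every index at which they hold the same character. Note that only characters at the same index are compared, so a character shared at different positions (such as the 'h' in 'python' and 'honbe') is not reported. The output can easily be changed to include the matched characters, and the approach extends to more than two words.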
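
As a rough illustration of both extensions, the sketch below returns (index, character) pairs and compares three words at once. The third word and the variable names are made up for the example, and it uses Python 3's built-in zip.

```python
words = ["hello_world", "hi_low_old", "he_lo_wold"]

# zip(*words) yields one tuple of characters per position and stops at
# the end of the shortest word.
matches = [
    (i, chars[0])
    for i, chars in enumerate(zip(*words))
    if all(c == chars[0] for c in chars)
]

print(matches)  # [(0, 'h'), (3, 'l'), (4, 'o'), (7, 'o')]
```

Because zip truncates to the shortest input, positions beyond the end of the shortest word are never compared.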

RMcG answered Oct 03 '22 23:10