I have two strings of equal length and want to match words that have the same index. I also want to group consecutive matches together, which is where I am having trouble.
For example, I have two strings:
alligned1 = 'I am going to go to some show'
alligned2 = 'I am not going to go the show'
What I am looking for is to get the result:
['I am','show']
My current code is as follows:
keys = []
for x in alligned1.split():
    for i in alligned2.split():
        if x == i:
            keys.append(x)
Which gives me:
['I','am','show']
Any guidance or help would be appreciated.
Finding matching words is fairly simple, but putting them in contiguous groups is trickier. I suggest using itertools.groupby.
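import itertools

alligned1 = 'I am going to go to some show'
alligned2 = 'I am not going to go the show'

results = []
# Pair up the words that share the same index in both strings.
word_pairs = zip(alligned1.split(), alligned2.split())
# groupby collects consecutive pairs with the same key (True = words match).
for k, v in itertools.groupby(word_pairs, key=lambda pair: pair[0] == pair[1]):
    if k:
        # Keep only the matching runs, joined back into a phrase.
        words = [pair[0] for pair in v]
        results.append(" ".join(words))
print(results)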
Result:
['I am', 'show']
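If you need this in more than one place, the groupby approach above can be wrapped in a function. This is only a minimal sketch: the name matching_runs is made up for illustration, and it assumes both strings split into the same number of whitespace-separated words (zip silently stops at the shorter one otherwise).

```python
from itertools import groupby


def matching_runs(first, second):
    """Return runs of consecutive words that match at the same index.

    Hypothetical helper for illustration; assumes whitespace-separated words.
    """
    pairs = zip(first.split(), second.split())
    runs = []
    # groupby starts a new group each time the key (match / no match) changes,
    # so each group is one maximal run of consecutive pairs with the same key.
    for is_match, group in groupby(pairs, key=lambda pair: pair[0] == pair[1]):
        if is_match:
            runs.append(" ".join(a for a, _ in group))
    return runs


alligned1 = 'I am going to go to some show'
alligned2 = 'I am not going to go the show'
print(matching_runs(alligned1, alligned2))  # ['I am', 'show']
```

Because the key is just True or False, a run ends as soon as one pair of words differs, which is what keeps 'I am' together and 'show' separate.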