
Python matching words with same index in string

I have two strings of equal length and want to find the words that match at the same index. I also want to group consecutive matches together, which is where I am having trouble.

For example, I have two strings:

alligned1 = 'I am going to go to some show'
alligned2 = 'I am not going to go the show'

What I am looking for is to get the result:

['I am','show']

My current code is as follows:

keys = []
for x in alligned1.split():
    for i in alligned2.split():
        if x == i:
            keys.append(x)

Which gives me:

['I','am','show']

Any guidance or help would be appreciated.

GNMO11 asked Apr 21 '15 15:04


1 Answer

Finding matching words is fairly simple, but grouping them into contiguous runs is trickier. I suggest using itertools.groupby, keyed on whether the two words at each index are equal.

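import itertools

alligned1 = 'I am going to go to some show'
alligned2 = 'I am not going to go the show'

results = []
# Pair up the words that share an index in the two strings.
word_pairs = zip(alligned1.split(), alligned2.split())
# groupby collects consecutive pairs that share a key (match or no match).
for k, v in itertools.groupby(word_pairs, key=lambda pair: pair[0] == pair[1]):
    if k:  # keep only the runs of matching words
        words = [pair[0] for pair in v]
        results.append(" ".join(words))

print(results)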

Result:

['I am', 'show']
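
To see why this works, it can help to look at the groups that groupby produces before the non-matching ones are filtered out. The sketch below wraps the same idea in a function; the names common_runs and flags are made up for illustration and are not part of the original answer. Note that zip stops at the shorter input, so any extra trailing words in the longer string are ignored.

```python
from itertools import groupby


def common_runs(first, second):
    """Return runs of consecutive words that match at the same index.

    Hypothetical helper name; the logic mirrors the answer's loop.
    """
    pairs = zip(first.split(), second.split())
    runs = []
    for is_match, group in groupby(pairs, key=lambda pair: pair[0] == pair[1]):
        if is_match:
            # Each pair holds two equal words, so take the first of each.
            runs.append(" ".join(a for a, _ in group))
    return runs


alligned1 = 'I am going to go to some show'
alligned2 = 'I am not going to go the show'

# Intermediate view: (key, run length) for every consecutive group.
pairs = list(zip(alligned1.split(), alligned2.split()))
flags = [(matched, len(list(group)))
         for matched, group in groupby(pairs, key=lambda p: p[0] == p[1])]

print(flags)                              # [(True, 2), (False, 5), (True, 1)]
print(common_runs(alligned1, alligned2))  # ['I am', 'show']
```

The first group covers 'I am', the middle group covers the five positions where the words differ, and the last group is 'show'. Only the groups whose key is True are kept.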
Kevin answered Sep 20 '22 07:09