So I have a list of strings as below:
list = ["I love cat", "I love dog", "I love fish", "I hate banana", "I hate apple", "I hate orange"]
How do I iterate through the list and group partially matching strings without being given any keywords? The result should look like this:
list 1 = [["I love cat","I love dog","I love fish"],["I hate banana","I hate apple","I hate orange"]]
Thank you so much.
To find all partial matches for a query string in a list of strings lst, combine the membership operator in with the filter() function: pass a lambda that tests membership for each element of the list, like so: list(filter(lambda x: query in x, lst)).
Use the in operator for partial matches, i.e., to test whether one string contains another. x in y returns True if x is a substring of y, and False if it is not. If the characters of x all appear in y but not as one contiguous run, False is returned.
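As a small illustration of the two points above, here is a minimal sketch. The sample list and the query "love" are made up for the example; note that this approach needs a known query, which the question does not have.

```python
# Sample data and query are illustrative assumptions, not from the question.
lst = ["I love cat", "I love dog", "I hate banana"]
query = "love"

# filter() keeps the elements for which the lambda returns True.
matches = list(filter(lambda x: query in x, lst))

# The equivalent list comprehension, often considered more readable.
matches_lc = [x for x in lst if query in x]

print(matches)

# "in" tests for a contiguous substring, not for scattered characters.
contiguous = "lov" in "I love cat"   # substring present
scattered = "lvc" in "I love cat"    # l, v, c all occur, but not as one run
print(contiguous, scattered)
```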
SequenceMatcher from the difflib module will do the task for you. Tune the score threshold (0.5 below) for better results.
Try this:
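from difflib import SequenceMatcher

sentence_list = ["I love cat", "I love dog", "I love fish", "I hate banana", "I hate apple", "I hate orange"]
THRESHOLD = 0.5  # minimum similarity ratio to join a group; tune for your data

result = []
for sentence in sentence_list:
    for group in result:
        # Compare against the first sentence of each existing group.
        score = SequenceMatcher(None, sentence, group[0]).ratio()
        if score >= THRESHOLD:
            if score != 1:  # skip exact duplicates of the group's first sentence
                group.append(sentence)
            break  # stop at the first matching group so a sentence joins only one
    else:
        # No existing group is similar enough, so start a new one.
        result.append([sentence])

print(result)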
Output:
[['I love cat', 'I love dog', 'I love fish'], ['I hate banana', 'I hate apple', 'I hate orange']]
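To experiment with the score threshold, the same idea can be wrapped in a function. This is only a sketch: the name group_similar and its threshold parameter are hypothetical, and each sentence is compared only against the first member of each group, so results depend on input order.

```python
from difflib import SequenceMatcher


def group_similar(sentences, threshold=0.5):
    """Greedily group sentences by similarity to each group's first member.

    group_similar and threshold are illustrative names, not a library API.
    """
    groups = []
    for sentence in sentences:
        for group in groups:
            score = SequenceMatcher(None, sentence, group[0]).ratio()
            if score >= threshold:
                group.append(sentence)
                break
        else:
            # No group was similar enough: open a new one.
            groups.append([sentence])
    return groups


sentence_list = ["I love cat", "I love dog", "I love fish",
                 "I hate banana", "I hate apple", "I hate orange"]

default_groups = group_similar(sentence_list)          # threshold 0.5
strict_groups = group_similar(sentence_list, 0.9)      # stricter matching
loose_groups = group_similar(sentence_list, 0.0)       # everything matches

print(len(default_groups), len(strict_groups), len(loose_groups))
```

A higher threshold splits the list into more, smaller groups, and a lower one merges them. Unlike the snippet above, this version keeps exact duplicates in their group.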