How to find the most similar word in a list in python

Tags:

python

I have a list of words

list = ['car', 'animal', 'house', 'animation']

I want to compare every item in the list with a string str1 and get the most similar word as output. For example, if str1 is anlmal, then animal is the most similar word. How can I do this in Python? The words in my list are usually easy to tell apart.

438

asked Oct 09 '14 16:10

JohnB

2 Answers

Use difflib:

difflib.get_close_matches(word, ['car', 'animal', 'house', 'animation'])

As you can see from perusing the source, the "close" matches are sorted from best to worst.

>>> import difflib
>>> difflib.get_close_matches('anlmal', ['car', 'animal', 'house', 'animation'])
['animal']
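
If only the single best word is needed, a minimal sketch along these lines may help. It relies on the documented n and cutoff parameters of difflib.get_close_matches and on difflib.SequenceMatcher.ratio(); the helper names closest_word and ranked_scores are made up for illustration and are not part of difflib.

```python
import difflib

words = ['car', 'animal', 'house', 'animation']

def closest_word(word, candidates, cutoff=0.6):
    """Return the single best match, or None if nothing reaches the cutoff."""
    # n=1 asks for at most one result; cutoff=0.6 is difflib's default threshold.
    matches = difflib.get_close_matches(word, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

def ranked_scores(word, candidates):
    """List (candidate, ratio) pairs from most to least similar."""
    scored = [(c, difflib.SequenceMatcher(None, c, word).ratio()) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

print(closest_word('anlmal', words))
for candidate, score in ranked_scores('anlmal', words):
    print(f"{candidate}: {score:.2f}")
```

Returning None when no candidate reaches the cutoff is a design choice of this sketch; lowering cutoff makes the match more permissive.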

114

answered Nov 03 '22 11:11

mgilson

I tried difflib.get_close_matches(), but it did not give the results I needed. Here is an alternative that scores each candidate by counting the characters that match test_str position by position. Use it as:

closest_match, closest_match_idx = find_closet_match(test_str, list2check)

import numpy

def find_closet_match(test_str, list2check):
    # Score each candidate by the number of positions where its
    # characters equal those of test_str.
    scores = {}
    for ii in list2check:
        cnt = 0
        # Compare only over the length of the shorter string.
        if len(test_str) <= len(ii):
            str1, str2 = test_str, ii
        else:
            str1, str2 = ii, test_str
        for jj in range(len(str1)):
            cnt += 1 if str1[jj] == str2[jj] else 0
        scores[ii] = cnt
    # Pick the candidate with the highest score.
    scores_values = numpy.array(list(scores.values()))
    closest_match_idx = numpy.argsort(scores_values, axis=0, kind='quicksort')[-1]
    closest_match = numpy.array(list(scores.keys()))[closest_match_idx]
    return closest_match, closest_match_idx
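
For comparison, the same position-by-position idea can be sketched without numpy. This is only an illustration: the names positional_score and best_positional_match are hypothetical, the candidate list is assumed to be non-empty, and ties go to the earliest candidate, which may differ from the argsort-based version above.

```python
def positional_score(a, b):
    """Count positions where a and b hold the same character."""
    # zip stops at the shorter string, so extra trailing characters are ignored.
    return sum(x == y for x, y in zip(a, b))

def best_positional_match(test_str, candidates):
    """Return (best candidate, its index); ties go to the earliest candidate."""
    scores = [positional_score(test_str, c) for c in candidates]
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best_idx], best_idx

words = ['car', 'animal', 'house', 'animation']
print(best_positional_match('anlmal', words))
print(positional_score('nimal', 'animal'))
```

Because the comparison is strictly positional, a dropped or inserted character shifts everything after it: 'nimal' scores 0 against 'animal'. This kind of scoring therefore suits substituted characters better than missing or extra ones.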

answered Nov 03 '22 11:11

amit
