
Is there a standard way in Python to fuzzy match a string against an arbitrary list of acceptable values?

I am hoping for a function like this:

def findSimilar(string, options):
    ....
    return aString

Here aString is the entry in options that is most similar to the passed string. I'm using this function to normalize user input in a toy application I'm working on. I read about using Levenshtein distance, but I decided to ask here because I'm hoping there is a simple solution in the Python standard library.

Others asked Dec 01 '22 19:12


2 Answers

Use difflib.get_close_matches.

get_close_matches(word, possibilities, n=3, cutoff=0.6)

Return a list of the best “good enough” matches. word is a sequence for which close matches are desired (typically a string), and possibilities is a list of sequences against which to match word (typically a list of strings).
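As a minimal sketch of how this could back the function from the question: the name find_similar, the sample list, and the choice to return None when nothing clears the cutoff are illustrative assumptions, not part of difflib. Only get_close_matches and its n and cutoff parameters come from the standard library.

```python
import difflib

def find_similar(string, options, cutoff=0.6):
    """Return the option closest to `string`, or None if nothing clears the cutoff.

    `find_similar` is a hypothetical wrapper; 0.6 is difflib's default cutoff.
    """
    # n=1 asks for the single best match; the result is a (possibly empty) list.
    matches = difflib.get_close_matches(string, options, n=1, cutoff=cutoff)
    return matches[0] if matches else None

colours = ["red", "green", "blue", "yellow"]  # sample data for illustration
print(find_similar("gren", colours))  # green
print(find_similar("xyz", colours))   # None
```

Raising cutoff toward 1.0 makes the match stricter, and lowering it toward 0.0 accepts looser matches. The comparison is case-sensitive, so lowercasing both the input and the options first is probably worthwhile when normalizing user input.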

shx2 answered Dec 16 '22 12:12


Calculate the Levenshtein distance:

http://en.wikipedia.org/wiki/Levenshtein_distance

There are already Python implementations, although I can't vouch for their quality.
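For reference, here is a small pure-Python sketch of the classic dynamic-programming formulation, assuming unit cost for each insertion, deletion, and substitution. The names levenshtein and find_similar are hypothetical helpers written for this example, not taken from any existing package.

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b (unit costs assumed)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]

def find_similar(string, options):
    """Return the option with the smallest edit distance to `string`."""
    return min(options, key=lambda option: levenshtein(string, option))

print(levenshtein("kitten", "sitting"))                # 3
print(find_similar("gren", ["red", "green", "blue"]))  # green
```

Note that this find_similar always returns some option, however distant, and raises ValueError on an empty list, so a maximum acceptable distance may be worth adding if unrelated input should be rejected.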

Benjamin answered Dec 16 '22 12:12