Are there any algorithms that would find the closest match to a string from a collection of strings? For example:
string_to_match = 'What color is the sky?'
strings = [
'What colour is the sea?',
'What colour is the sky?',
'What colour is grass?',
'What colour is earth?'
]
answer = method_using_string_matching_algorithm(string_to_match, strings)
answer # returns strings[1] 'What colour is the sky?'
The search terms you're looking for are "string distance algorithms" and "approximate string matching." A quick Google search for either term turns up a number of options.
As of this writing, Debian-based Linux distributions also include agrep and TRE-agrep in their repositories.
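To make the idea of a string distance concrete, here is a minimal Python sketch that uses Levenshtein (edit) distance, one common string distance algorithm, to pick the closest candidate. The function names `levenshtein` and `closest_match` are made up for illustration and are not from any particular library; this is one possible approach, not the only or necessarily the best one for your data.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the prefix of a processed
    # so far and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def closest_match(query: str, candidates: list[str]) -> str:
    """Return the candidate with the smallest edit distance to query.
    Ties go to the earliest candidate; an empty list raises ValueError."""
    return min(candidates, key=lambda s: levenshtein(query, s))


string_to_match = 'What color is the sky?'
strings = [
    'What colour is the sea?',
    'What colour is the sky?',
    'What colour is grass?',
    'What colour is earth?',
]
answer = closest_match(string_to_match, strings)
print(answer)
```

With the question's data this selects strings[1], since 'What colour is the sky?' differs from the query by a single inserted 'u'. If you would rather not write the distance function yourself, Python's standard library also provides `difflib.get_close_matches`, which ranks candidates by a similarity ratio instead of edit distance.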