Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fuzzy Matching with threshold filter C#

I need to implement some kind of this:

string textToSearch = "Extreme Golf: The Showdown";
string textToSearchFor = "Golf Extreme Showdown";
int fuzzyMatchScoreThreshold = 80; // One a 0 to 100 scale
bool searchSuccessful = IsFuzzyMatch(textToSearch, textToSearchFor, fuzzyMatchScoreThreshold);
if (searchSuccessful == true)
{
    -- we have a match.
}

Here's the function stub written in C#:

public bool IsFuzzyMatch (string textToSearch, string textToSearchFor, int fuzzyMatchScoreThreshold)
{
   bool isMatch = false;
   // do fuzzy logic here and set isMatch to true if successful match.
   return isMatch;
}

But I have no any idea how to implement logic in IsFuzzyMatch method. Any ideas? Perhaps there is a ready-made solution for this purpose?

like image 583
Nazar Grynko Avatar asked Nov 03 '10 11:11

Nazar Grynko


2 Answers

I like a combination of Dice Coeffiecient, Levenshtein Distance, Longest Common Subsequence, and at times the Double Metaphone. The first three will provide you a threshold value. I prefer to combine them in some way. YMMV.

I've just posted a blog post that has a C# implementation for each of these called Four Functions for Finding Fuzzy String Matches in C# Extensions.

like image 185
Tyler Jensen Avatar answered Nov 01 '22 04:11

Tyler Jensen


You need Levenshtein Distance Algorithm for find how to go from one string to another by operations insert, delete and modify. You fuzzyMatchScoreThreshold is a Levenshtein Distance divided to length of the string in simple way.

like image 39
Singlet Avatar answered Nov 01 '22 03:11

Singlet