
Comparing names

Is there a simple algorithm to determine the likelihood that two names represent the same person?

I'm not asking for anything at the level a customs department might use. Just a simple algorithm that would tell me whether 'James T. Clark' is most likely the same name as 'J. Thomas Clark' or 'James Clerk'.

An algorithm in C# would be great, but I can translate from any language.

ADB asked Feb 03 '26 21:02



1 Answer

Sounds like you're looking for a phonetic algorithm, such as Soundex, NYSIIS, or Double Metaphone. The first is actually what several government departments use, and it is trivial to implement (many implementations are readily available). The second is a slightly more complicated and more precise version of the first. The last also works with some non-English names and alphabets.
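To give a feel for how little code Soundex needs, here is a minimal sketch in Python (easy to port to C#). It follows the common American Soundex rules as I understand them: keep the first letter, map the remaining consonants to digits, collapse runs of the same digit, and pad to four characters. The function name `soundex` and the `CODES` table are my own choices, not from any particular library, and it handles one word at a time, so you would apply it to each part of a full name separately.

```python
# Digit for each consonant group; vowels, Y, H and W have no code.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return a 4-character Soundex code for a single word (sketch, ASCII letters only)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of same-coded letters
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return result.ljust(4, "0")


print(soundex("Clark"), soundex("Clerk"))
```

With this sketch 'Clark' and 'Clerk' receive the same code, so a surname check reduces to comparing the two codes for equality.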

Levenshtein distance is a measure of distance between two arbitrary strings: the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other. It is 0 for identical strings and grows as the strings diverge, which might also be useful if you decide to build a custom algorithm.
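If you go the custom route, Levenshtein distance is also short to write. The following is a sketch of the usual dynamic-programming version in Python, keeping only two rows of the table at a time; the function name `levenshtein` is just illustrative. It is case-sensitive, so you would probably want to lowercase and strip punctuation from the names first.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("James Clark", "James Clerk"))
```

A small distance relative to the length of the names suggests a likely typo or spelling variant; where to put the threshold is something you would have to tune on your own data.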

Andrey Fedorov answered Feb 06 '26 10:02



