I have a list of strings in Java containing a person's first name in several similar (not entirely different) spellings. For example, John may be spelled as Jon, Jawn, Jaun, etc. How should I retrieve the most appropriate string from this list? If anyone can suggest how to use Soundex in this case, it would be a great help.
You have to use an approximate string matching algorithm. There are several strategies for implementing this. Blur is a Trie-based Java implementation of approximate string matching based on the Levenshtein word distance.
Another strategy is an approximate string matching variant of the Boyer-Moore algorithm.
The usual approach with these algorithms and the Levenshtein word distance is to compare the input against each candidate string and choose the candidate with the smallest distance to the input.
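As a rough illustration of the "smallest distance" idea, here is a minimal sketch in plain Java. It does not use Blur or any other library; the class name `ClosestName` and the methods `levenshtein` and `closest` are made up for this example, and comparing in lower case is an assumption on my part.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, written only for this illustration.
class ClosestName {

    // Levenshtein distance via dynamic programming, keeping two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the candidate with the smallest distance to the input
    // (the first one wins on ties), or null if the list is empty.
    static String closest(String input, List<String> candidates) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        String needle = input.toLowerCase(Locale.ROOT);
        for (String candidate : candidates) {
            int d = levenshtein(needle, candidate.toLowerCase(Locale.ROOT));
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("John", "Mary", "Peter");
        System.out.println(closest("Jawn", names)); // prints John
    }
}
```

In practice you would probably also want a maximum allowed distance, so that an input with no reasonable match does not get mapped to an unrelated name.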
There is also a jar file for approximate string matching.
Follow the link below and download frej.jar:
http://sourceforge.net/projects/frej/files/
This jar file provides the following method:
Fuzzy.equals("jon","john");
It returns true for approximately matching strings like these.
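Since the question asks about Soundex specifically, here is a minimal sketch of the standard American Soundex rules in plain Java. It is not taken from frej or any other library; the class name `SimpleSoundex` and its methods `encode` and `soundsAlike` are made up for this example. Names that produce the same four-character code are treated as sounding alike.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, written only for this illustration.
class SimpleSoundex {

    // Soundex digit for each letter A..Z; '0' means "not coded" (vowels, H, W, Y).
    private static final String CODES = "01230120022455012623010202";

    static String encode(String name) {
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (char raw : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (raw < 'A' || raw > 'Z') {
                continue; // ignore anything that is not a basic Latin letter
            }
            char code = CODES.charAt(raw - 'A');
            if (out.length() == 0) {
                out.append(raw); // keep the first letter as-is
                last = code;
                continue;
            }
            if (raw == 'H' || raw == 'W') {
                continue; // H and W do not separate letters with the same code
            }
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code; // a vowel resets 'last', so a repeated code after it counts
            if (out.length() == 4) {
                break;
            }
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    // Returns every candidate whose Soundex code equals the code of the query.
    static List<String> soundsAlike(String query, List<String> candidates) {
        String target = encode(query);
        List<String> result = new ArrayList<>();
        for (String candidate : candidates) {
            if (encode(candidate).equals(target)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Jon", "Jawn", "Jaun", "Mary");
        System.out.println(soundsAlike("John", names)); // prints [Jon, Jawn, Jaun]
    }
}
```

With these rules John, Jon, Jawn and Jaun all encode to J500. Soundex only tells you whether two names fall into the same bucket, so if several candidates share a code you may still want an edit distance to pick the closest one.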