Soundex seems to be implemented in some DBMSs, but have there been any algorithmic improvements that are definitively better than the current implementation of Soundex?
Yes. As Wikipedia points out, there's Metaphone and Double Metaphone, NYSIIS and more.
Keep in mind that these only work for English, which has its own particular problems with orthography. Phonetic matching of this kind is hardly needed for Spanish, and doesn't make sense for Chinese/Mandarin.
I don't know about "definitively better", but you might want to look at Metaphone (and its variants) and Caverphone. See, e.g., http://www.atomodo.com/code/double-metaphone, where there's an implementation of Double Metaphone for use with MySQL.
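For context on what the alternatives are trying to improve, here is a minimal Python sketch of classic American Soundex as it is commonly described. It is not the code any particular DBMS uses, and the function name `soundex` and the `CODES` table are my own choices for illustration.

```python
# Letter groups used by American Soundex; vowels, y, h and w have no code.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a 4-character Soundex code, or "" if name has no ASCII letters."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")  # the first letter's code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code  # a vowel resets prev, so a repeated code after it is kept
    return result.ljust(4, "0")


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Knight", "Night"):
        print(name, soundex(name))
```

With this sketch, "Robert" and "Rupert" share a code, but "Knight" and "Night" do not, because the first letter is kept verbatim. Handling cases like silent letters is the kind of thing Metaphone and its variants try to address with spelling-aware rules.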