I have a web and a mobile dictionary application that use SQL Server. I am trying to implement a simple version of a "did you mean" feature: if the phrase the user entered does not exist in the database, I need to make suggestions.
I am planning to use the Levenshtein distance algorithm, but there is one point I couldn't figure out: do I need to calculate the Levenshtein distance between the user's entry and every word in my database, one by one?
Let's assume I have one million words in my database. When the user enters an incorrect word, will I have to calculate the distance a million times?
Obviously that would take a great deal of time. What is the best practice for this situation?
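To make the cost concrete, here is a minimal Python sketch of the brute-force approach described above. Python is used only for illustration, and the names `levenshtein` and `suggest_brute_force` as well as the `max_distance` threshold are assumptions, not part of any existing application.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def suggest_brute_force(query, words, max_distance=2):
    """Score every word against the query: one distance call per word."""
    scored = [(levenshtein(query, w), w) for w in words]
    return [w for d, w in sorted(scored) if d <= max_distance]


if __name__ == "__main__":
    print(suggest_brute_force("recieve", ["receive", "recipe", "banana"]))
```

With a million words, `suggest_brute_force` makes a million `levenshtein` calls per lookup, which is exactly the cost I would like to avoid.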
Have you already looked at the SOUNDEX function that is built into SQL Server?
You could use a trigger that calculates the SOUNDEX code of a column and saves it next to that column each time the column is updated. When searching, you calculate the SOUNDEX code of the search term and compare it with the stored SOUNDEX column in the table, so only words with a matching code need to be considered.
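The idea of precomputing the code and looking it up can be sketched in Python as follows. This is only an illustration under assumptions: `soundex` here follows the common American Soundex rules for ASCII letters and may differ in edge cases from SQL Server's built-in `SOUNDEX`, and `build_index` and `candidates` are hypothetical names standing in for the stored column and the lookup query.

```python
from collections import defaultdict

# Consonant groups of American Soundex; vowels, h, w and y get no digit.
_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(word):
    """Four-character phonetic code: first letter plus up to three digits."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
            if len(code) == 4:
                break
        if ch not in "hw":  # h and w do not separate equal codes
            prev = digit
    return code.ljust(4, "0")


def build_index(words):
    """Precompute the code once per word, like the stored SOUNDEX column."""
    index = defaultdict(list)
    for w in words:
        index[soundex(w)].append(w)
    return index


def candidates(index, query):
    """Return only the words whose stored code equals the query's code."""
    return index.get(soundex(query), [])


if __name__ == "__main__":
    index = build_index(["receive", "recipe", "believe", "retrieve"])
    print(candidates(index, "recieve"))
```

In this small example the misspelling "recieve" shares its code with "receive" and "recipe", so only those two words come back. A more expensive comparison such as Levenshtein distance would then only need to run on that short list rather than on the whole dictionary.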