
How to configure Solr to use Levenshtein approximate string matching?

Does Apache's Solr search engine provide approximate string matching, e.g. via the Levenshtein algorithm?

I'm looking for a way to find customers by last name, but I cannot guarantee that the names are spelled correctly. How can I configure Solr so that it finds the person "Levenshtein" even if I search for "Levenstein"?

asked Nov 17 '09 22:11 by prinzdezibel



2 Answers

Typically this is done with the SpellCheckComponent. By default it uses the Lucene SpellChecker internally, which implements Levenshtein distance.
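For readers unfamiliar with the metric, here is a minimal Python sketch of Levenshtein distance, using the two names from the question. It only illustrates the idea (counting single-character insertions, deletions, and substitutions); it is not Solr's or Lucene's implementation, and the function name `levenshtein` is chosen here purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b (insert, delete, substitute)."""
    # previous[j] is the distance between the already processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep a matching character
            ))
        previous = current
    return previous[-1]


# The misspelling from the question differs by a single missing "h".
print(levenshtein("Levenshtein", "Levenstein"))
```

A small distance relative to the word length is what lets an approximate matcher treat the two spellings as the same name.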

The Solr wiki explains how it works, how to configure it, and which options are available, so there is no point repeating it here.

Or you could just use Lucene's fuzzy search operator (~).

Another option is using a phonetic filter instead of Levenshtein.

answered Nov 11 '22 16:11 by Mauricio Scheffer



Great answer by Mauricio. My only "cheapo" addition is to append the ~ character to every term that you want to fuzzy match on the way in to Solr. If you are using the default setup, this will give you a fuzzy match.
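As a rough sketch of that idea, the Python helper below appends ~ to each whitespace-separated term before the query string is sent to Solr. The field name `last_name` and the function names are hypothetical, and the escaping assumes the standard Lucene/Solr query syntax; adjust both to your schema and query parser.

```python
import re

# Characters with special meaning in the standard Lucene/Solr query syntax.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_term(term: str) -> str:
    """Backslash-escape query syntax characters inside a single term."""
    return _SPECIAL.sub(r"\\\1", term)


def fuzzy_query(user_input: str, field: str = "last_name") -> str:
    """Build a query string in which every term is fuzzy matched.

    `field` is a hypothetical field name used only for illustration.
    """
    terms = user_input.split()
    return " ".join(f"{field}:{escape_term(term)}~" for term in terms)


print(fuzzy_query("Levenstein"))
```

The resulting string would be passed as the `q` parameter of a normal select request. Depending on the Solr version, the operator may also accept a value after the ~ to tighten or loosen the match.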

answered Nov 11 '22 16:11 by MattMcKnight
