
How to get Solr Suggester to return spelling suggestions as well

I'm integrating Apache Solr search into my platform and using the Suggester for autocompletion. However, the Suggester does not return spelling corrections. For example, if I search for:

shi

the Suggester returns, among others, the following:

shirt
shirts

However, if I search for:

shrt

No suggestions are returned. What I'd like to know is:

a) Is this the result of an incorrect Suggester configuration?
b) Is the Suggester built in such a way that it does not return spelling suggestions?
c) How can I get the Suggester to return spelling suggestions as well, without making a second request for spelling corrections?

I have read the Solr documentation but cannot seem to make any headway with this.

asked Jun 15 '12 08:06 by newbie



1 Answer

You need to configure a spell check component to generate alternative spellings, as described at https://lucene.apache.org/solr/guide/8_1/spell-checking.html

The task consists of the following steps:

First, add a spellcheck field to schema.xml. This usually means creating a new field and copying several existing fields into it:

<field name="spellcheck" type="text_general" 
   indexed="true" 
   stored="false" 
   multiValued="true"/>

<copyField source="id" dest="spellcheck"/>
<copyField source="name" dest="spellcheck"/>
<copyField source="description" dest="spellcheck"/>
<copyField source="longdescription" dest="spellcheck"/>
<copyField source="category" dest="spellcheck"/>
<copyField source="source" dest="spellcheck"/>
<copyField source="merchant" dest="spellcheck"/>
<copyField source="contact" dest="spellcheck"/>

Next, in solrconfig.xml, define a solr.SpellCheckComponent and add it to your search handler's last-components:

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      <lst name="spellchecker">
        <!-- decide between dictionary based vs index based spelling suggestions, 
        in most cases it makes sense to use index based spell checker
        as it only generates terms which are 
        actually present in your search corpus -->
        <str name="classname">solr.IndexBasedSpellChecker</str>
        <!-- field to use -->
        <str name="field">spellcheck</str>
        <!-- buildOnCommit|buildOnOptimize -->
        <str name="buildOnCommit">true</str>
        <!-- $solr.solr.home/data/spellchecker-->
        <str name="spellcheckIndexDir">./spellchecker</str>
        <str name="accuracy">0.7</str>
        <float name="thresholdTokenFrequency">.0001</float>
      </lst>
    </searchComponent>

    <requestHandler name="/select" class="solr.SearchHandler">
      <lst name="defaults">
        <str name="echoParams">explicit</str>
        <int name="rows">10</int>
        <str name="df">defaultSearchField</str>
        <!-- spell check component configuration -->
        <str name="spellcheck">true</str>
        <str name="spellcheck.count">5</str>
        <str name="spellcheck.collate">true</str>
        <str name="spellcheck.maxCollationTries">5</str>
      </lst>
      <!-- add spell check processing after 
        the default search component. This is 
        the search component name. -->
      <arr name="last-components">
        <str>spellcheck</str>
      </arr>
    </requestHandler>
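
To get a feel for what the accuracy value of 0.7 in the configuration above means, here is a minimal Python sketch. It assumes the spell checker scores candidates with a Levenshtein-based similarity normalised by the length of the longer string, which is my understanding of the default distance measure. The function names are illustrative and not part of Solr.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def passes_accuracy(query: str, candidate: str, accuracy: float = 0.7) -> bool:
    """True if the candidate is similar enough to be kept (assumed rule)."""
    return similarity(query, candidate) >= accuracy


if __name__ == "__main__":
    for query, candidate in [("shrt", "shirt"), ("coachin", "cochin"), ("shrt", "shoes")]:
        print(query, candidate, round(similarity(query, candidate), 3),
              passes_accuracy(query, candidate))
```

Under this assumption, lowering accuracy admits more distant candidates, while raising it restricts suggestions to near-identical terms.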

  • Reindex the corpus.

  • Test that suggestions are working. For example:

http://localhost:8983/solr/select/?q=coachin

{
  "responseHeader": {
    "status": 0,
    "QTime": 12,
    "params": {
      "indent": "true",
      "q": "coachin"
    }
  },
  "response": {
    "numFound": 0,
    "start": 0,
    "docs": []
  },
  "spellcheck": {
    "suggestions": [
      "coachin", {
        "numFound": 1,
        "startOffset": 0,
        "endOffset": 7,
        "suggestion": ["cochin"]
      }
    ]
  }
}
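
On the client side, the suggestions array in a response like the one above alternates between the misspelled term and an object describing its suggestions. The following Python sketch shows one way to build the request URL and flatten that array into a dictionary. It assumes the flat layout shown above and plain string suggestions; the base URL and the function names are illustrative only.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, query: str) -> str:
    """Build a select URL that asks for JSON output with spell checking on."""
    params = {"q": query, "wt": "json", "spellcheck": "true"}
    return f"{base_url}?{urlencode(params)}"


def extract_suggestions(response: dict) -> dict:
    """Map each misspelled term to its list of suggested replacements.

    Assumes "suggestions" is a flat list of alternating term / details
    entries, as in the sample response. Entries whose second element is
    not an object are skipped.
    """
    flat = response.get("spellcheck", {}).get("suggestions", [])
    result = {}
    for term, details in zip(flat[0::2], flat[1::2]):
        if isinstance(details, dict):
            result[term] = list(details.get("suggestion", []))
    return result


if __name__ == "__main__":
    sample = json.loads("""
    {"spellcheck": {"suggestions": ["coachin",
        {"numFound": 1, "startOffset": 0, "endOffset": 7,
         "suggestion": ["cochin"]}]}}
    """)
    print(build_query_url("http://localhost:8983/solr/select", "coachin"))
    print(extract_suggestions(sample))
```

If the response is requested in a different list format or with extended results enabled, the shape of the suggestions entry may differ and the parsing would need adjusting.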
answered Oct 19 '22 05:10 by Nitin Tripathi