Get Solr autosuggest results for a phrase

I want to use Solr to build an autosuggest dropdown for a search field, but I am stuck on getting suggestions for a phrase. When I search for "dog t", I want a single result set containing phrases such as "dog treat", "dog trick", "dog tags", and so on. Instead I get two result sets: one for "dog" (such as "dogs", "dog bone", "doggy") and another for "t" (such as "tree", "time").

My query url is:

http://localhost:8985/solr/mycollection/suggest?q=%22dog%20t%22&wt=json

and my search component and request handler are defined in solrconfig.xml as:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">suggest</str>  <!-- the indexed field to derive suggestions from -->
    <float name="threshold">0.0001</float>
    <str name="buildOnCommit">true</str>
  </lst>
  <str name="queryAnalyzerFieldType">textSuggest</str>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

The field type of the "suggest" field is defined in the schema as:

<fieldType name="textSuggest" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="0"
            preserveOriginal="1"
            splitOnCaseChange="1"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
jessieloo asked Sep 25 '12 21:09


1 Answer

I found two solutions to my issue.

One is to use a custom query converter (the links below describe it) that doesn't split the q parameter into multiple words.

  • http://wiki.apache.org/solr/Suggester?highlight=%28suggest%29#Tips_and_tricks
  • https://issues.apache.org/jira/browse/SOLR-3143
  • http://lucene.472066.n3.nabble.com/suggester-issues-td3262718.html
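
For this first option, a minimal sketch of what the configuration might look like, assuming a Solr release that ships org.apache.solr.spelling.SuggestQueryConverter (the phrase-oriented converter discussed in SOLR-3143; check that your version includes it). The element is assumed to sit at the top level of solrconfig.xml, alongside the searchComponent rather than inside it:

```xml
<!-- Assumption: SuggestQueryConverter is available in your Solr version.
     It hands the whole q string to the query analyzer instead of
     splitting it on whitespace the way the default converter does. -->
<queryConverter name="queryConverter"
                class="org.apache.solr.spelling.SuggestQueryConverter"/>
```

With a converter like this in place, it is probably best to send the phrase without surrounding quotes (q=dog%20t), since the whole string is likely to be treated as a single token by the KeywordTokenizerFactory query analyzer.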

The other option, which I chose, is to use the spellcheck.q parameter instead of q. I was on Solr 3.4.0, where using spellcheck.q gave me a 500 error. After upgrading to Solr 3.6.1, it seems to work correctly.

  • http://lucene.472066.n3.nabble.com/Spell-Checking-a-multi-word-phrase-td2274531.html
  • SolR : NullPointerException when using spellcheck.q
jessieloo answered Oct 02 '22 00:10