
How to have Solr autocomplete on whole phrase when query contains multiple terms?

Tags: solr

I've looked through a ton of examples and other questions here, and from them I've got my config very close to what I need, but I'm missing one last little bit that I'm having a heck of a time working out. I'm searching on values like:

solar powered
solar glass
solar globe
solar lights
solar magic
solid brass
solid copper

What I want:

  1. If I search for sol, the result should include all of these values. This works.
  2. If I search for solar, I should get just the first five. This works.
  3. If I search for solar gl, I should get only solar glass and solar globe. This does not work. Instead, I get one set of matches for solar and a second set of matches for gl.

In a nutshell, I want to consider the input string as a whole, regardless of any whitespace. I gather this is accomplished by creating a separate query (versus index) analyzer, but I've not been able to make it work. Can anyone suggest a configuration that will get me what I'm looking for?
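To make the desired behavior concrete, here is a minimal sketch in plain Python (not Solr code) that contrasts whole-phrase prefix matching with per-term matching over the sample values. The function names are hypothetical and exist only for illustration; the first function mimics what a keyword tokenizer plus lowercasing would give, assuming the input is kept as a single token.

```python
PHRASES = [
    "solar powered", "solar glass", "solar globe", "solar lights",
    "solar magic", "solid brass", "solid copper",
]

def whole_phrase_matches(query, phrases):
    """Treat the whole query as one lowercased token and prefix-match it."""
    needle = query.lower()
    return [p for p in phrases if p.lower().startswith(needle)]

def per_term_matches(query, phrases):
    """Split the query on whitespace and prefix-match each term on its own."""
    return {
        term: [p for p in phrases if p.lower().startswith(term)]
        for term in query.lower().split()
    }

if __name__ == "__main__":
    print(whole_phrase_matches("solar gl", PHRASES))  # the behavior wanted
    print(per_term_matches("solar gl", PHRASES))      # one lookup per term
```

The second function shows the unwanted case: the input is broken into solar and gl, and each term gets its own lookup.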

I've (unsuccessfully) tried:

  • Querying with "solar gl"
  • Querying with mm=100%
  • Defining separate query and index analyzers both using KeywordTokenizerFactory. (I don't know what the heck I thought that would do.)
  • Defining an index analyzer but not a query analyzer.
  • Defining a query analyzer with no tokenizer.

Here's my current schema:

<field name="suggest_phrase" type="suggest_phrase"
    indexed="true" stored="false" multiValued="false" />

And the field type definition:

<fieldType name="suggest_phrase" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
    </analyzer>
</fieldType>

And the config:

<searchComponent name="suggest_phrase" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
        <str name="name">suggest_phrase</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
        <str name="field">suggest_phrase</str>
        <str name="buildOnCommit">true</str>
    </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest_phrase">
    <lst name="defaults">
        <str name="spellcheck">true</str>
        <str name="spellcheck.dictionary">suggest_phrase</str>
        <str name="spellcheck.onlyMorePopular">true</str>
        <str name="spellcheck.count">10</str>
        <str name="spellcheck.collate">false</str>
    </lst>
    <arr name="components">
        <str>suggest_phrase</str>
    </arr>
</requestHandler>
Alex Howansky asked Aug 08 '13 17:08


2 Answers

Found the answer, finally! I knew I was really close. Turns out my configuration above was correct and I simply needed to change my query.

  1. Use KeywordTokenizerFactory so that the strings get indexed as a whole.
  2. Use SpellCheckComponent for the request handler.
  3. The piece I was missing -- don't query with q=<string> but with spellcheck.q=<string>.

Given the source strings noted above and a query of spellcheck.q=solar+gl, this yields the desired results:

solar glass
solar globe
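As a rough sketch of what such a request could look like from client code, the following builds the URL with spellcheck.q instead of q. The base URL (host, port, and core name) is an assumption and should be replaced with your own; the /suggest_phrase handler name comes from the config in the question, and wt=json is only there to ask for a JSON response.

```python
from urllib.parse import urlencode

def suggest_phrase_url(base_url, phrase):
    """Build a request URL that sends the phrase as spellcheck.q, not q."""
    params = {
        "spellcheck.q": phrase,  # the whole phrase, whitespace included
        "wt": "json",            # response format
    }
    # urlencode encodes the space as '+', giving spellcheck.q=solar+gl
    return f"{base_url}/suggest_phrase?{urlencode(params)}"

if __name__ == "__main__":
    # Hypothetical base URL; adjust host, port, and core name.
    print(suggest_phrase_url("http://localhost:8983/solr/mycore", "solar gl"))
```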
Alex Howansky answered Nov 02 '22 20:11


You may use the AnalyzingInfixLookupFactory or the FreeTextLookupFactory:

  • AnalyzingInfixLookupFactory returns the entire content of the field.
  • FreeTextLookupFactory returns a defined number of tokens.

You will find more details and other suggester algorithms here: http://alexbenedetti.blogspot.de/2015/07/solr-you-complete-me.html

Solr Configuration

<lst name="suggester">
  <str name="name">AnalyzingInfixSuggester</str>
  <str name="lookupImpl">AnalyzingInfixLookupFactory</str> 
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>
  <str name="field">title</str>
  <str name="weightField">price</str>
  <str name="suggestAnalyzerFieldType">text_en</str>
</lst>

<lst name="suggester">
  <str name="name">FreeTextSuggester</str>
  <str name="lookupImpl">FreeTextLookupFactory</str> 
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>
  <str name="field">title</str>
  <str name="ngrams">3</str>
  <str name="separator"> </str>
  <str name="suggestFreeTextAnalyzerFieldType">text_general</str>
</lst>
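These suggester blocks are meant to sit inside a solr.SuggestComponent search component, which is queried with suggest.* parameters rather than the spellcheck.* ones used in the question. A minimal sketch of building such a request, assuming the component is exposed through a handler at /suggest (that path and the base URL are assumptions, not part of the config above):

```python
from urllib.parse import urlencode

def suggester_url(base_url, dictionary, phrase, handler="/suggest"):
    """Build a SuggestComponent request for one named suggester."""
    params = {
        "suggest": "true",                 # turn the suggest component on
        "suggest.dictionary": dictionary,  # suggester name from the config
        "suggest.q": phrase,               # the text to complete
    }
    return f"{base_url}{handler}?{urlencode(params)}"

if __name__ == "__main__":
    # Hypothetical base URL; adjust host, port, and core name.
    print(suggester_url("http://localhost:8983/solr/mycore",
                        "AnalyzingInfixSuggester", "solar gl"))
```

The dictionary argument must match the name given in the corresponding suggester block, here AnalyzingInfixSuggester or FreeTextSuggester.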
Matthias M answered Nov 02 '22 20:11