I use solr 5.0.0 and want to create an autocomplete functionality generating suggestions from the word-grams (or shingles) of my documents. The problem is that in return of a suggest-query I only get complete "terms" of the search field which can be extremly long.
CURRENT PROBLEM:
Input:"so" Suggestions: "......extremly long text son long text continuing......"
"......next long text solar next text continuing......"
GOAL:
Input: "so"
Suggestions with shingles:
"son"
"solar"
"solar test"
etc
<searchComponent name="suggest" class="solr.SuggestComponent"
enable="${solr.suggester.enabled:true}" >
<lst name="suggester">
<str name="name">mySuggester</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">title_and_description_suggest</str>
<str name="weightField">price</str>
<str name="suggestAnalyzerFieldType">autocomplete</str>
<str name="queryAnalyzerFieldType">autocomplete</str>
<str name="buildOnCommit">true</str>
</lst>
schema.xml:
<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball"/>
<filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true" outputUnigramsIfNoShingles="true"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
I want to return max 3 words as autocomplete term. Is this possible with the SuggestComponent or how would you do it? No matter what I try I always receive the complete field value of matching documents.
Is that expected behaviour or what did I do wrong?
Many thanks in advance
In schema.xml define fieldType as follows:
<fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="5"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
In schema.xml define your field as follows:
<field name="example_field" type="text_autocomplete" indexed="true" stored="true"/>
Write your query as follows:
query?q=*&
rows=0&
facet=true&
facet.field=example_field&
facet.limit=-1&
wt=json&
indent=true&
facet.prefix=so
In the facet.prefix field, specify the term being searched for which you want suggestions ('so', in this example). If you need less than 5 words in the suggestion, reduce maxShingleSize in the fieldType definition accordingly. By default, you will get the results in decreasing order of their frequency of occurrence.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With