I use Solr 3.6 and would like to use collations from the Suggester as an autocomplete solution for multi-term searches. Unfortunately, the Suggester returns only one collation for a multi-term search, even when many suggestions exist for each individual term. Based on my test searches and the underlying indexed data, I'm sure that more collations must exist.
Is something wrong with my Suggester configuration?
<!--configuration -->
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
<str name="field">text</str> <!-- the indexed field to derive suggestions from -->
<!--<float name="threshold">0.0005</float> disabled for test-->
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">200</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.maxCollations">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
Example response for q=bio+ber:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">4</int>
</lst>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="bio">
<int name="numFound">27</int>
<int name="startOffset">0</int>
<int name="endOffset">3</int>
<arr name="suggestion">
<str>bio</str>
<str>bio-estetica</str>
<str>bio-kosmetik</str>
...
</arr>
</lst>
<lst name="ber">
<int name="numFound">81</int>
<int name="startOffset">4</int>
<int name="endOffset">7</int>
<arr name="suggestion">
<str>beratung</str>
<str>bern</str>
...
</arr>
</lst>
<str name="collation">bio beratung</str>
</lst>
</lst>
</response>
I was having the same problem as you, and I managed to solve it. It turns out there are several things you need to know in order to get multiple collations to work properly.
First, you must specify a QueryComponent under the components list of the "suggest" requestHandler in your solrconfig.xml. Otherwise your requestHandler does not know how to query the index, so it can't figure out how many hits each corrected query has, and you'll only get one collation. If you had added spellcheck.collateExtendedResults=true to your query, you would have seen that the hits were 0, which shows that Solr didn't bother to check the corrected query against the index.
Solr hints at this with a somewhat opaque log message:
INFO: Could not find an instance of QueryComponent. Disabling collation verification against the index.
The easiest way to add it is to use the default QueryComponent, which is called "query". So in the XML you posted above, you'd change the "components" part to:
<arr name="components">
<str>suggest</str>
<str>query</str>
</arr>
Secondly, you need to set spellcheck.maxCollations to be more than 1 (duh), and less intuitively, you need to set spellcheck.maxCollationTries to be some large number (e.g. 1000). If either of these is left at its default (1 for spellcheck.maxCollations, 0 for spellcheck.maxCollationTries), then Solr will only give you one collation. Also, you need to set spellcheck.count to be greater than 1.
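Putting these first two points together, the "/suggest" handler from the question could look roughly like the sketch below. It reuses the handler and dictionary names from the question. The value 1000 for spellcheck.maxCollationTries is only the illustrative figure mentioned above, not a tuned setting. Putting spellcheck.collateExtendedResults in the defaults is my assumption for convenience, since it can just as well be passed on the request.

```xml
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <!-- must be greater than 1 so each term has several candidates -->
    <str name="spellcheck.count">200</str>
    <str name="spellcheck.collate">true</str>
    <!-- must be greater than 1 to get more than one collation back -->
    <str name="spellcheck.maxCollations">10</str>
    <!-- illustrative value; must be non-zero so collations are checked against the index -->
    <str name="spellcheck.maxCollationTries">1000</str>
    <!-- assumption: set here for convenience; shows the hits for each collation -->
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
    <!-- the default QueryComponent, needed to verify collations -->
    <str>query</str>
  </arr>
</requestHandler>
```

Any of these defaults can still be overridden per request by passing the same parameter in the query string.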
Thirdly, you need to modify the query to include the field you want to search against, and the terms must be surrounded by quotes to ensure proper collation. So in the case of your query:
q=bio+ber
This really should be:
q=text:"bio+ber"
Obviously in your case, "text" is the default field, so you don't need it. But in my case, I was using a non-default field, so I had to specify it. Otherwise, Solr would count the hits against the "text" field, and all the results would have 0 hits, so the ranking would be useless.
So in my case, the query looked like this:
q=my_field:"brain+c"
&spellcheck.count=5
&spellcheck.maxCollations=10
&spellcheck.maxCollationTries=1000
&spellcheck.collateExtendedResults=true
And my response looked like this:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">4</int>
</lst>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="brain">
<int name="numFound">1</int>
<int name="startOffset">15</int>
<int name="endOffset">20</int>
<arr name="suggestion">
<str>brain</str>
</arr>
</lst>
<lst name="c">
<int name="numFound">4</int>
<int name="startOffset">21</int>
<int name="endOffset">23</int>
<arr name="suggestion">
<str>cancer</str>
<str>cambrian</str>
<str>contusion</str>
<str>cells</str>
</arr>
</lst>
<lst name="collation">
<str name="collationQuery">my_field:"brain cancer"</str>
<int name="hits">2</int>
<lst name="misspellingsAndCorrections">
<str name="brain">brain</str>
<str name="c">cancer</str>
</lst>
</lst>
<lst name="collation">
<str name="collationQuery">my_field:"brain contusion"</str>
<int name="hits">1</int>
<lst name="misspellingsAndCorrections">
<str name="brain">brain</str>
<str name="c">contusion</str>
</lst>
</lst>
<lst name="collation">
<str name="collationQuery">my_field:"brain cells"</str>
<int name="hits">1</int>
<lst name="misspellingsAndCorrections">
<str name="brain">brain</str>
<str name="c">cells</str>
</lst>
</lst>
</lst>
</lst>
<result name="response" numFound="0" start="0"/>
</response>
Success!