I’m doing some faceted searches but have a few problems. I don’t get the desired results when there are several words in the faceted search field.
Example: “animal” field with the following entries:
A horse
Black horse
Black horse
La faceted search sends back "horse(3)" as best result, whereas I would like to get back "Black horse(2)".
And this is the schema.xml. The search field is BUSQUEDA, and the faceted field is SUPERFICIE. I think I have tried most of the posible combinations of the defined types for these two fields but still doesn't work.
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.2">
<types>
<fieldType name="string" class="solr.StrField"/>
<fieldType name="facet_texPersonal" class="solr.StrField" sortMissingLast="true" omitNorms="true">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
</analyzer>
</fieldType>
<fieldType name="facet_tex" class="solr.TextField" sortMissingLast="true" omitNorms="true">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.TrimFilterFactory" />
</analyzer>
</fieldType>
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
</analyzer>
</fieldType>
<fieldType name="textTight" class="solr.TextField" positionIncrementGap="100" >
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="textMultidioma" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
enablePositionIncrements="true" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
</types>
<fields>
<field name="BUSQUEDA" type="facet_tex" indexed="true" stored="true"/>
<field name="SUPERFICIE" type="facet_tex" indexed="true" stored="true"/>
<field name="NOMBRE" type="string" indexed="true" stored="true"/>
</fields>
<uniqueKey>NOMBRE</uniqueKey>
<defaultSearchField>BUSQUEDA</defaultSearchField></schema>
Any suggestions?
Thanks a bunch in advance!
Faceted search uses product or content features as criteria for a website visitor to refine their search results. Your user will get specific and relevant options to filter their result page. This makes faceted search an easy and practical way to search for products or pages.
Faceted search allows users to reduce the number of search results quickly to get to the desired item(s). Showing narrowing options (facets) is easier for users because they don't have to know the syntax necessary to specify their search precisely.
Faceted search is a technique that involves augmenting traditional search techniques with a faceted navigation system, allowing users to narrow down search results by applying multiple filters based on faceted classification of the items. It is sometimes referred to as a parametric search technique.
Faceted navigation systems operate by creating a new URL for every filtered search. They'll either dynamically generate the URL, creating something like the one used in our example. Or they'll append parameters that specify how the category URL is behaving (more on this later).
You have to facet on a non-tokenized field (field class solr.StrField, or using solr.KeywordTokenizerFactory). This thread explains it in detail.
We had multi-word faceted fields working for a project that I worked on previously. Here is (part of) the schema.xml relating to this:
<schema name="example" version="1.2">
<types>
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true" />
...
</types>
<fields>
<field name="grant_type" type="string" indexed="true" stored="true" />
...
</fields>
</schema>
As Mauricio has highlighted the facet field has to be non-tokenized (not split in to separate words). In the config above we are using the 'solr.StrField' (non-tokenized) field type.
Further hints for faceted field types (not converting to lowercase, not stripping out punctuation, etc.) can be found on the Solr Faceting Overview page.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With