Problem with faceted search

Tags:

solr

I’m doing some faceted searches but have a few problems. I don’t get the desired results when there are several words in the faceted search field.

Example: “animal” field with the following entries:

        A horse

        Black horse

        Black horse

The faceted search returns "horse(3)" as the top result, whereas I would like to get "Black horse(2)".

Here is the schema.xml. The search field is BUSQUEDA, and the faceted field is SUPERFICIE. I think I have tried most of the possible combinations of the defined types for these two fields, but it still doesn't work.

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.2">
  <types>

    <fieldType name="string" class="solr.StrField"/>

    <fieldType name="facet_texPersonal" class="solr.StrField" sortMissingLast="true" omitNorms="true">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
      </analyzer>
    </fieldType>

    <fieldType name="facet_tex" class="solr.TextField" sortMissingLast="true" omitNorms="true">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.TrimFilterFactory" />
      </analyzer>
    </fieldType>

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
                enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
                enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
                catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
      </analyzer>
    </fieldType>

    <fieldType name="textTight" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0"
                catenateWords="1" catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

    <fieldType name="textMultidioma" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
                enablePositionIncrements="true" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
                catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

  </types>

  <fields>
    <field name="BUSQUEDA" type="facet_tex" indexed="true" stored="true"/>
    <field name="SUPERFICIE" type="facet_tex" indexed="true" stored="true"/>
    <field name="NOMBRE" type="string" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>NOMBRE</uniqueKey>
  <defaultSearchField>BUSQUEDA</defaultSearchField>
</schema>

Any suggestions?

Thanks a bunch in advance!


asked Feb 08 '10 16:02

Carlos

2 Answers

You have to facet on a non-tokenized field: either a field of class solr.StrField, or a solr.TextField whose analyzer uses solr.KeywordTokenizerFactory. This thread explains it in detail.
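
As a rough illustration of the two options named above, here is a minimal sketch of a schema.xml fragment. It assumes SUPERFICIE from the question is the field being faceted on; the type name "facet_keyword" is hypothetical and only there to show the second option.

```xml
<!-- Fragment of schema.xml, not a complete file. -->
<types>
  <!-- Option 1: StrField indexes each value as a single term, exactly as supplied. -->
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

  <!-- Option 2: TextField with KeywordTokenizerFactory also produces one term per value,
       but still allows filters. "facet_keyword" is a hypothetical name. -->
  <fieldType name="facet_keyword" class="solr.TextField" sortMissingLast="true" omitNorms="true">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.TrimFilterFactory"/>
    </analyzer>
  </fieldType>
</types>
<fields>
  <!-- Facet field from the question, here using option 1. -->
  <field name="SUPERFICIE" type="string" indexed="true" stored="true"/>
</fields>
```

Facet values are the indexed terms, so any filter in the analyzer (lowercasing, for instance) shows up in the facet labels. After changing a field's type the documents need to be re-indexed. A request with `facet=true&facet.field=SUPERFICIE` should then count whole values, which for the example data would give "Black horse" a count of 2 and "A horse" a count of 1.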


answered Oct 23 '22 05:10

Mauricio Scheffer

We had multi-word faceted fields working for a project that I worked on previously. Here is (part of) the schema.xml relating to this:

<schema name="example" version="1.2">
 <types>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true" />
    ...
 </types>  
 <fields>
  <field name="grant_type" type="string" indexed="true" stored="true" />
  ...
 </fields>
</schema>

As Mauricio has highlighted, the facet field has to be non-tokenized (not split into separate words). In the config above we use the non-tokenized 'solr.StrField' field type.

Further hints for faceted field types (not converting to lowercase, not stripping out punctuation, etc.) can be found on the Solr Faceting Overview page.
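
If the same value also needs to be searchable word by word, one common arrangement is to keep a tokenized field for searching and copy the raw value into an untokenized field used only for faceting. The fragment below is a sketch under that assumption, not the configuration from our project; "SUPERFICIE_facet" is a hypothetical field name, and the "text" and "string" types are the ones defined in the question's schema.

```xml
<!-- Fragment of schema.xml, not a complete file. -->
<fields>
  <!-- Tokenized field used for full-text search. -->
  <field name="SUPERFICIE" type="text" indexed="true" stored="true"/>
  <!-- Untokenized copy used only for faceting (hypothetical name); it does not need to be stored. -->
  <field name="SUPERFICIE_facet" type="string" indexed="true" stored="false"/>
</fields>
<!-- copyField copies the original input value, before any analysis, at index time. -->
<copyField source="SUPERFICIE" dest="SUPERFICIE_facet"/>
```

With this layout, queries would go against SUPERFICIE while the facet request would use `facet.field=SUPERFICIE_facet`, so the facet labels keep their original case and punctuation.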

answered Oct 23 '22 04:10

Jonathan Williams
