
Performing EXACT match on SOLR search

Tags: solr

I am implementing a Solr search. When I search for, e.g., Richard Chase, I get all the Richards in the index and all the Chases (Johnny Chase, etc.), when I actually only want names that match BOTH Richard AND Chase.

My config settings are:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

and my query searches the text field:

text:Richard Chase

Any ideas what I'm doing wrong?

fredseagul asked Aug 14 '13 06:08


2 Answers

You are using StandardTokenizerFactory, which follows Unicode word-boundary rules.

This means your input is split into separate tokens at whitespace (and most punctuation), so Richard Chase is indexed and queried as two terms.

If you want a true exact match, i.e. the query Richard Chase returning only documents whose field value is exactly Richard Chase, then you should use KeywordTokenizerFactory, which keeps the entire field value as a single token.
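
As a minimal sketch, an exact-match field type in schema.xml could look roughly like the one below. The name text_exact is made up for illustration, and the LowerCaseFilterFactory is an optional assumption that makes the exact match case-insensitive.

```xml
<!-- Hypothetical field type: the name "text_exact" is illustrative only -->
<fieldType name="text_exact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Emits the whole field value as one token, so "Richard Chase" is never split -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Optional: lowercases the token so "richard chase" also matches -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A field using this type would need to be reindexed, and the query value should be quoted (for example text_exact:"Richard Chase") so that the query parser does not split it at the space before analysis.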

But since you want to match Richard John Chase and not Johnny Chase, it sounds like you want documents that contain both Richard and Chase.

You could either search for Richard AND Chase, or change the default operator in schema.xml from OR to AND. Beware that this setting is global.
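
For the global option, a sketch of the schema.xml change, assuming a Solr version whose schema.xml still accepts the solrQueryParser element (newer releases favour the q.op request parameter instead):

```xml
<!-- schema.xml: makes AND the default operator for every query against this core -->
<solrQueryParser defaultOperator="AND"/>
```

To avoid the global change, the operator can be set per request with q.op=AND, or spelled out in the query itself as text:(Richard AND Chase). The parentheses matter: in the standard query syntax a field prefix applies only to the term right after it, so in text:Richard Chase the term Chase is searched in the default field rather than explicitly in text.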

Srikanth Venugopalan answered Sep 22 '22 09:09


You have to use a phrase query (text:"Richard Chase") to get documents where Richard and Chase appear next to each other, in that order. If you also want to find, say, Richard X. Chase, you can allow a slop of 1: text:"richard chase"~1.

See http://www.solrtutorial.com/solr-query-syntax.html

Konstantin Gribov answered Sep 22 '22 09:09