
Solr wildcard searching

Tags:

solr

If I have a record with the keywords Chris Muench, I want to be able to match on Mue or Chr. How can I do this with a Solr query? Currently I do the following:

$results = $solr->search('"'.Apache_Solr_Service::escape($_GET['textsearch']).'"~100', 0, 100, array('fq' => 'type:datacollection'));

This doesn't match Mue or Chr, but it does match Muench.

Schema:

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="rocdocs" version="1.4">
  <types>
    <!-- The StrField type is not analyzed, but indexed/stored verbatim. -->
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
 </types>


 <fields>
    <field name="type" type="string" indexed="true" stored="true" required="true" />
    <field name="mongo_id" type="string" indexed="true" stored="true" required="true" />
    <field name="nid" type="int" indexed="true" stored="true" required="true" />
    <field name="keywords" type="text_general" indexed="true" stored="false" />
 </fields>

 <!-- Field to use to determine and enforce document uniqueness. 
      Unless this field is marked with required="false", it will be a required field
   -->
 <uniqueKey>mongo_id</uniqueKey>

 <!-- field for the QueryParser to use when an explicit fieldname is absent -->
 <defaultSearchField>keywords</defaultSearchField>
 <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
 <solrQueryParser defaultOperator="OR"/>
</schema>
Chris Muench asked Sep 04 '12 16:09



1 Answer

One option is to use wildcard queries, e.g. chr* or mue*, which would match.
This requires either the client to enter the query in this format or the application to modify the query before sending it to Solr.
Alternatively, you can generate prefix tokens at index time using solr.EdgeNGramFilterFactory. For example, chris would generate ch, chr, chri, chris, so a search for any of these prefixes would match the record.
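
As a rough sketch of the second option, the schema could gain an index-time edge n-gram field type along these lines. The type name text_prefix and the gram sizes (2 and 15) are assumptions chosen for illustration and do not come from the original schema; the stop-word and synonym filters are left out to keep the fragment short.

```xml
<!-- Hypothetical field type; "text_prefix" is an illustrative name. -->
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits leading-edge grams: "chris" -> "ch", "chr", "chri", "chris" -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-gram filter here: the typed prefix is matched as a single token -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Replaces the existing keywords field definition rather than adding a second one -->
<field name="keywords" type="text_prefix" indexed="true" stored="false" />
```

After a change like this the documents have to be reindexed before prefixes such as chr or mue can match. Keeping the n-gram filter out of the query analyzer is a deliberate choice in this sketch: otherwise the query term itself would be split into grams and match far more records than intended.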

Jayendra answered Nov 16 '22 03:11
