 

Substring matches within SOLR

I can't figure out how to find substring matches with SOLR. I've worked out prefix matching, so I can get ham to match hamburger.

How would I get a search for 'burger' to match hamburger as well? I tried a leading wildcard (*burger), but this threw the error: '*' or '?' not allowed as first character in WildcardQuery.

How can I match substrings using SOLR?

Michael asked Jun 21 '10 20:06



3 Answers

If anyone ends up here after searching for "apachesolr substring", there's a simpler solution: https://drupal.stackexchange.com/a/27956/10419 (from https://drupal.stackexchange.com/questions/26024/how-can-i-make-search-with-a-substring-of-a-word)

Add an n-gram filter to the text field type definition in schema.xml, in the Solr config directory.

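<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="25" />
    <!-- rest of the existing analyzer definition (tokenizer, other filters) omitted here -->
  </analyzer>
</fieldType>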
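One thing worth checking before relying on the filter above: an edge n-gram filter only emits grams anchored at one edge of the term (the front, by default), so it may not cover an infix such as "burger". A full n-gram filter (Solr ships solr.NGramFilterFactory with the same minGramSize/maxGramSize attributes) emits every substring in the size range. The following is a rough pure-Python model of the two behaviours, not Solr code; the function names are made up for illustration.

```python
def edge_ngrams(term, min_size=3, max_size=25):
    """Prefixes of `term` with length between min_size and max_size."""
    upper = min(max_size, len(term))
    return [term[:n] for n in range(min_size, upper + 1)]


def all_ngrams(term, min_size=3, max_size=25):
    """Every contiguous substring of `term` with length in [min_size, max_size]."""
    upper = min(max_size, len(term))
    return [
        term[start:start + n]
        for n in range(min_size, upper + 1)
        for start in range(len(term) - n + 1)
    ]


edge = edge_ngrams("hamburger")
full = all_ngrams("hamburger")

# Front-anchored grams cover the prefix "ham" but not the infix "burger".
print("ham" in edge, "burger" in edge)
# Full n-grams cover both, at the cost of many more indexed terms.
print("ham" in full, "burger" in full)
print(len(edge), len(full))
```

The last line gives a feel for the index-size trade-off: the full n-gram list grows much faster with term length than the edge list does.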
Paul answered Nov 09 '22 12:11



You can enable leading wildcards, but they will be very resource-hungry (search for SuffixQuery, for example).

See: http://lucene.472066.n3.nabble.com/Leading-Wildcard-Search-td522362.html

Quoting the mailing list: "Work arounds? Imagine making a second index (or adding another field) with all of the terms spelled backwards."

=> See "Add ReverseStringFilter" (https://issues.apache.org/jira/browse/LUCENE-1398) and "Support for efficient leading wildcards search" (https://issues.apache.org/jira/browse/SOLR-1321).

At the moment issues.apache.org seems to be down; try Google's cache, for example.
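To make the "terms spelled backwards" workaround concrete, here is a small sketch of the idea in plain Python. It is not how Solr implements it; the helper names and the sample terms are invented for illustration. A leading-wildcard search such as *burger becomes a cheap prefix scan for "regrub" over the reversed terms.

```python
from bisect import bisect_left


def build_reversed_index(terms):
    """Sorted list of (reversed term, original term) pairs."""
    return sorted((term[::-1], term) for term in terms)


def suffix_search(reversed_index, suffix):
    """Find terms ending in `suffix` by prefix-scanning the reversed terms."""
    prefix = suffix[::-1]
    # A 1-tuple sorts just before any pair whose first element starts with prefix.
    start = bisect_left(reversed_index, (prefix,))
    matches = []
    for reversed_term, original in reversed_index[start:]:
        if not reversed_term.startswith(prefix):
            break  # sorted order: no later entry can match
        matches.append(original)
    return matches


index = build_reversed_index(
    ["hamburger", "burger", "hamster", "cheeseburger", "burgundy"]
)
print(suffix_search(index, "burger"))
```

Note that this only covers terms ending in the search string; a match in the middle of a term would still need something like n-grams.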

Karussell answered Nov 09 '22 11:11



As stated in the link above, you can use leading wildcards with edismax (ExtendedDismaxQParser). Try it out to see whether it is fast enough.
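For trying this out, a request URL could be assembled roughly as follows. This is only a sketch: the host, port, core name (mycore) and field name (name) are placeholders, not values from this thread; defType=edismax is the standard way to select the parser.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL and core name; adjust to your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_edismax_url(query, base_url=SOLR_SELECT_URL):
    """Return a select URL that asks Solr to parse `query` with edismax."""
    params = {"defType": "edismax", "q": query}
    return base_url + "?" + urlencode(params)


# "name" is a placeholder field; the leading * is the wildcard under test.
url = build_edismax_url("name:*burger")
print(url)
```

Timing a few such requests against a realistic index is probably the quickest way to see whether the leading wildcard is fast enough.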

More information about the reversed-string approach mentioned above can be found under solr.ReversedWildcardFilterFactory.

Jem answered Nov 09 '22 13:11
