
Django-Haystack with Solr contains search

I am using Haystack in a project with Solr as the backend. I want to perform a "contains" search, similar to Django's .filter(something__contains="...").

The __startswith option does not suit our needs because, as the name suggests, it only matches words that start with the string.

I tried using something like *keyword*, but Solr does not allow * as the first character of a query term.

Thanks.

neolaser asked Jun 14 '11 00:06



1 Answer

To get "contains" functionality you can use:

<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="100" side="back"/>
<filter class="solr.LowerCaseFilterFactory" />

as the index analyzer.

This creates edge n-grams anchored at the end of each whitespace-separated word in your field, that is, every suffix of the word. For example:

"Index this!" => x, ex, dex, ndex, index, !, s!, is!, his!, this!

As you can see, this will expand your index greatly, but if you now enter a query like:

"nde*"

it will match "ndex", giving you a hit.
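
To see why a trailing-wildcard query behaves like a "contains" search here, the following is a rough Python sketch that imitates the analyzer chain above outside of Solr. It is not Solr or Haystack code: the function names (back_edge_ngrams, index_terms, prefix_query_matches) are made up for illustration, and it assumes the query is already lowercase.

```python
def back_edge_ngrams(token, min_gram=1, max_gram=100):
    """Return the suffixes of `token` with lengths min_gram..max_gram, lowercased.

    Imitates EdgeNGramFilterFactory with side="back" followed by a lowercase filter.
    """
    longest = min(len(token), max_gram)
    return [token[-n:].lower() for n in range(min_gram, longest + 1)]


def index_terms(text, min_gram=1, max_gram=100):
    """Split `text` on whitespace and collect the grams of every token."""
    terms = set()
    for token in text.split():
        terms.update(back_edge_ngrams(token, min_gram, max_gram))
    return terms


def prefix_query_matches(terms, query):
    """Emulate a trailing-wildcard query such as "nde*" against the indexed terms.

    Assumption: the query is already lowercase; no query-side analysis is modelled.
    """
    prefix = query.rstrip("*")
    return any(term.startswith(prefix) for term in terms)


terms = index_terms("Index this!")
print(sorted(terms))
print(prefix_query_matches(terms, "nde*"))  # a prefix of the suffix "ndex"
print(prefix_query_matches(terms, "xyz*"))  # no indexed term starts with "xyz"
```

Any substring of a word is a prefix of one of its suffixes, which is why indexing all suffixes and querying with a trailing * gives "contains" behaviour.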

Use this approach carefully to make sure that your index doesn't get too large. If you increase minGramSize or decrease maxGramSize, the index will not expand as much, but the "contains" functionality will be reduced. For instance, setting minGramSize="3" will require at least 3 characters in your contains query.
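
As a back-of-the-envelope illustration of that trade-off, the sketch below counts how many grams each setting would produce for the example text. It only does arithmetic on token lengths and does not talk to Solr; gram_count and total_grams are hypothetical helper names, and tokens shorter than minGramSize are assumed to produce no grams at all.

```python
def gram_count(token_length, min_gram, max_gram):
    """Number of edge n-grams emitted for one token of the given length."""
    upper = min(token_length, max_gram)
    # Assumption: a token shorter than min_gram contributes nothing.
    return max(0, upper - min_gram + 1)


def total_grams(text, min_gram, max_gram):
    """Sum the gram counts over all whitespace-separated tokens in `text`."""
    return sum(gram_count(len(tok), min_gram, max_gram) for tok in text.split())


text = "Index this!"
for min_gram, max_gram in [(1, 100), (3, 100), (1, 4)]:
    print(min_gram, max_gram, total_grams(text, min_gram, max_gram))
```

With minGramSize="3" the two example words yield fewer grams, but one- and two-character suffixes are no longer indexed, so substrings that only occur in the last two characters of a word cannot be found.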

lindstromhenrik answered Oct 11 '22 23:10
