I am trying to use Solr to search records whose FirstName values are:
abcd
Abcd
abcD
ABcd
abCd
abCD
Now I am trying to search with wildcard support, and I need to understand how the search handles case sensitivity.
For example, if I pass the FirstName parameter as ab* versus Ab*, which records would be returned?
Is there a way to force the search to be case-sensitive or case-insensitive?
Some search tools are case-insensitive by default and let you make a search case-sensitive with a case filter. For example, in such a tool the search case:yes HelloWorld returns only results that match the term HelloWorld and excludes results where the case doesn't match, such as helloWorld or helloworld. This is not Solr query syntax; in Solr, case sensitivity is determined by how the field is analyzed.
You can search for "solr" by loading the Admin UI Query tab, entering "solr" in the q param (replacing *:*, which matches all documents), and clicking "Execute Query". See the Searching section below for more information. To index your own data, re-run the directory indexing command pointed at your own directory of documents.
It depends on how you define your fields in schema.xml. If you apply LowerCaseFilterFactory at both index time and query time, all queries on that field will be case-insensitive. Otherwise they will be case-sensitive.
<filter class="solr.LowerCaseFilterFactory"/>
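To make the effect on the records from the question concrete, here is a minimal Python sketch that imitates prefix matching against indexed terms. It is not Solr code: `prefix_search` and its `lowercase` flag are hypothetical stand-ins for a field without and with LowerCaseFilterFactory applied at both index and query time, and it assumes the wildcard term itself is lowercased too.

```python
NAMES = ["abcd", "Abcd", "abcD", "ABcd", "abCd", "abCD"]


def prefix_search(names, pattern, lowercase):
    """Return the names whose indexed term starts with the prefix in `pattern`.

    `pattern` is a trailing-wildcard query such as "ab*".
    `lowercase` imitates lowercasing at both index and query time.
    """
    prefix = pattern.rstrip("*")
    if lowercase:
        prefix = prefix.lower()
    hits = []
    for name in names:
        # The indexed term is lowercased only when the filter is "enabled".
        term = name.lower() if lowercase else name
        if term.startswith(prefix):
            hits.append(name)
    return hits


if __name__ == "__main__":
    for pattern in ("ab*", "Ab*"):
        print(pattern, "case-sensitive:", prefix_search(NAMES, pattern, lowercase=False))
        print(pattern, "case-insensitive:", prefix_search(NAMES, pattern, lowercase=True))
```

Under these assumptions, a case-sensitive field returns abcd, abcD, abCd and abCD for ab* and only Abcd for Ab*, while a lowercased field returns all six records for either query.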
The field types defined in the default Solr schema behave very differently from one another.
The 'string' type
stores the value as a single exact token, with no analysis, so a query must match the whole value, including its case.
The 'text_general' type
typically performs tokenization and secondary processing such as lowercasing, so matching is case-insensitive and works on individual words. It is useful whenever you want to match part of a sentence.
If the sample "Search into the sentence" is indexed into both field types, you must search for exactly Search into the sentence to get a hit from the string field, while the text_general field will also match individual words such as sentence.
In the following example, seller_name is matched exactly against the search string, while product_name is searched word by word within the whole sentence.
Example:
<field name="seller_name" type="string" indexed="true" stored="true"/>
<field name="product_name" type="text_general" indexed="true" stored="true"/>
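The difference between the two field types can be imitated with a rough Python sketch. This only approximates Solr's analysis: `analyze_string`, `analyze_text_general` and `matches` are hypothetical helpers, and the text_general imitation covers just tokenization and lowercasing, ignoring stop words and synonyms.

```python
import re


def analyze_string(value):
    # A `string` field keeps the whole value as one unmodified term.
    return [value]


def analyze_text_general(value):
    # Rough approximation of text_general: split into words, then lowercase.
    return [token.lower() for token in re.findall(r"\w+", value)]


def matches(indexed_terms, query_terms):
    # A document is a hit when every query term is among its indexed terms.
    return all(term in indexed_terms for term in query_terms)


DOC = "Search into the sentence"

if __name__ == "__main__":
    # string field: only the exact, full value matches.
    print(matches(analyze_string(DOC), analyze_string("sentence")))
    print(matches(analyze_string(DOC), analyze_string(DOC)))
    # text_general field: a single word in any case matches.
    print(matches(analyze_text_general(DOC), analyze_text_general("SENTENCE")))
```

In this sketch the string field rejects the partial query and accepts only the full sentence, while the text_general imitation accepts a single word regardless of case.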
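You configure it within your schema. For example:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="query">
<!-- An analyzer needs a tokenizer before any filters; this tokenizer is an assumed choice. -->
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
This means query terms are lowercased before matching, which gives the impression of a case-insensitive search, provided the indexed terms are lowercased as well.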
By default, a value is matched exactly against the indexed value. If you want a field to be case-insensitive, the usual way is to use a field type with a lowercase filter, which makes all the indexed content the same case and in practice makes the search case-insensitive (since the query value is also lowercased).
The example content does this for the 'text' and 'text_en' field types:
<filter class="solr.LowerCaseFilterFactory"/>
There are, however, a few areas where automatic handling of lowercasing for wildcard queries can cause trouble, and multi-term query analysis (MultitermQueryAnalysis) was introduced in Solr 3.6 and 4.0 to handle those situations. Versions 3.6 and 4.0 should handle wildcard searches correctly without extra work if the field is already lowercased.
If you are not getting the correct behaviour before 3.6, I'd suggest lowercasing the name in the query when using wildcards (as long as you've applied the LowerCaseFilterFactory when indexing as well).
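Lowercasing the wildcard term on the client side could look like the following Python sketch. The helper names, the base URL and the core name `mycore` are hypothetical, and the sketch assumes the prefix contains no characters that need escaping in Solr query syntax.

```python
from urllib.parse import urlencode


def build_wildcard_query(field, prefix):
    """Return a Solr `q` value for a prefix search, with the term lowercased."""
    return f"{field}:{prefix.lower()}*"


def build_select_url(base_url, field, prefix):
    """Return a select URL for the lowercased prefix query.

    `base_url` is assumed to point at a core, e.g. a hypothetical
    http://localhost:8983/solr/mycore.
    """
    params = {"q": build_wildcard_query(field, prefix), "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_wildcard_query("FirstName", "Ab"))
    print(build_select_url("http://localhost:8983/solr/mycore", "FirstName", "Ab"))
```

With this approach both Ab and ab produce the same query, FirstName:ab*, so the result no longer depends on whether the Solr version analyzes wildcard terms.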