How to implement case-sensitive search in Elasticsearch?

I have a field in my indexed documents that I need to search case-sensitively. I am using the match query to fetch the results. An example document:

{
  "name" : "binoy",
  "age" : 26,
  "country": "India"
}

Now when I give the following query:

{
  "query" : {
    "match" : {
      "name" : "Binoy"
    }
  }
}

It returns the document containing "binoy" even though I searched for "Binoy". I want the search to be case sensitive. By default, Elasticsearch seems to be case insensitive. How can I make the search case sensitive in Elasticsearch?

Binoy Bhanujan asked May 22 '15 06:05


3 Answers

In the mapping you can define the field as not_analyzed, so that its value is indexed as a single term exactly as given.

curl -X PUT "http://localhost:9200/sample" -d '{
  "index": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'

echo
curl -X PUT "http://localhost:9200/sample/data/_mapping" -d '{
  "data": {
    "properties": {
      "name": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}'

Now you can index and search as usual. The field is not analyzed, so values are matched exactly, which makes the search case sensitive.
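As a rough illustration of why this works, here is a toy model in Python. It is a sketch under stated assumptions, not Elasticsearch internals: the helper names `index_not_analyzed` and `exact_lookup` are hypothetical, and the model only captures the idea that a not_analyzed value is kept as one unmodified term and compared verbatim.

```python
def index_not_analyzed(value):
    """Toy model (hypothetical helper): keep the whole value as one unchanged term."""
    return {value}


def exact_lookup(terms, query_text):
    """Toy model (hypothetical helper): compare the query text verbatim, no lower-casing."""
    return query_text in terms


# Index the sample value from the question.
terms = index_not_analyzed("binoy")

print(exact_lookup(terms, "binoy"))  # True: same case as the stored term
print(exact_lookup(terms, "Binoy"))  # False: differs only in case, so no match
```

Note that in this model the whole field value is a single term, so a query has to equal the full value, not just one word of it.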

Vineeth Mohan answered Oct 10 '22 22:10


It depends on the mapping you have defined for your field name. If you haven't defined any mapping, Elasticsearch will treat it as a string and use the standard analyzer (which lower-cases the tokens) to generate tokens. Your query will use the same analyzer at search time, so the query input is lower-cased as well before matching. That's why "Binoy" matches "binoy".
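The effect described above can be illustrated with a toy model in Python. This is only a sketch, not Elasticsearch code: `toy_standard_analyze` is a hypothetical stand-in that splits text into word tokens and lower-cases them, which approximates but does not reproduce the real standard analyzer.

```python
import re


def toy_standard_analyze(text):
    """Hypothetical stand-in for the standard analyzer: word tokens, lower-cased."""
    return [token.lower() for token in re.findall(r"\w+", text)]


# Index time: the stored value is analyzed into terms.
indexed_terms = set(toy_standard_analyze("binoy"))

# Search time: the query text goes through the same analyzer.
query_terms = toy_standard_analyze("Binoy")

# Both sides were lower-cased, so the query term is found among the indexed terms.
matches = any(term in indexed_terms for term in query_terms)
print(matches)  # True
```

Because lower-casing happens on both the indexing side and the query side, the original case of either input never reaches the comparison.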

To solve it, you can define a custom analyzer without the lowercase filter and use it for your field name. You can define the analyzer as below:

"analyzer": {
  "casesensitive_text": {
    "type": "custom",
    "tokenizer": "standard",
    "filter": ["stop", "porter_stem"]
  }
}

You can define the mapping for name as below

"name": {
    "type": "string", 
    "analyzer": "casesensitive_text"
}

Now you can search on name.

Note: the analyzer above is only an example. You may need to change it to suit your needs.

Prabin Meitei answered Oct 10 '22 21:10


Define your index settings and mapping like this:

PUT /whatever
{
  "settings": {
    "analysis": {
      "analyzer": {
        "mine": {
          "type": "custom",
          "tokenizer": "standard"
        }
      }
    }
  },
  "mappings": {
    "type": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "mine"
        }
      }
    }
  }
}

That is, the custom analyzer has no lowercase filter, so tokens keep their original case.
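To make the consequence concrete, here is a toy Python model of an analyzer that tokenizes but does not lower-case. It is a sketch only: `toy_case_preserving_analyze` is a hypothetical helper, the regular expression merely approximates the standard tokenizer, and the two-word sample value is made up for illustration.

```python
import re


def toy_case_preserving_analyze(text):
    """Hypothetical stand-in: split into word tokens and keep their case."""
    return re.findall(r"\w+", text)


# Made-up sample value with two words, to show per-token matching.
indexed_terms = set(toy_case_preserving_analyze("Binoy Bhanujan"))


def toy_match(query_text):
    """Toy match: true if any analyzed query token is among the indexed terms."""
    return any(t in indexed_terms for t in toy_case_preserving_analyze(query_text))


print(toy_match("Binoy"))  # True: token exists with this exact case
print(toy_match("binoy"))  # False: case differs, and nothing lower-cases it
```

In this model a single word of the field still matches, provided its case is identical, because the value is split into tokens before being compared.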

Andrei Stefan answered Oct 10 '22 22:10