Why is ElasticSearch match query returning all results?

I have the following ElasticSearch query, which I expected to return only the documents whose email field equals john@gmail.com:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "email": "john@gmail.com"
          }
        }
      ]
    }
  }
}

The mapping for the user type that is being searched is the following:

    {
      "users": {
        "mappings": {
          "user": {
            "properties": {
              "email": {
                "type": "string"
              },
              "name": {
                "type": "string",
                "fields": {
                  "raw": {
                    "type": "string",
                    "index": "not_analyzed"
                  }
                }
              },
              "nickname": {
                "type": "string"
              }
            }
          }
        }
      }
    }

The following is a sample of the results returned from ElasticSearch:

    [
      {
        "_index": "users",
        "_type": "user",
        "_id": "54b19c417dcc4fe40d728e2c",
        "_score": 0.23983537,
        "_source": {
          "email": "[email protected]",
          "name": "John Smith",
          "nickname": "jsmith"
        }
      },
      {
        "_index": "users",
        "_type": "user",
        "_id": "9c417dcc4fe40d728e2c54b1",
        "_score": 0.23983537,
        "_source": {
          "email": "[email protected]",
          "name": "Walter White",
          "nickname": "wwhite"
        }
      },
      {
        "_index": "users",
        "_type": "user",
        "_id": "4fe40d728e2c54b19c417dcc",
        "_score": 0.23983537,
        "_source": {
          "email": "[email protected]",
          "name": "Jimmy Fallon",
          "nickname": "jfallon"
        }
      }
    ]

Given the query above, I expected a document to match only if its email property is exactly 'john@gmail.com'.

How does the ElasticSearch DSL query need to change in order to return only exact matches on email?

asked Jan 12 '15 04:01

TheJediCowboy

1 Answer

The email field was analyzed (tokenized) at index time, which is why you see this behavior. When you indexed the document, the value was split into tokens:

"myemail@gmail.com" => [ "myemail", "gmail.com" ]

As a result, a search for either myemail or gmail.com matches that document. The same analyzer is also applied to the search query, so when you search for john@gmail.com, the query is broken into:

"john@gmail.com" => [ "john", "gmail.com" ]

Because the "gmail.com" token appears in both the search terms and the indexed terms, you get a match.
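The token overlap described above can be imitated in a few lines of Python. This is only a rough sketch: `analyze` is a simplified stand-in for the default analyzer (an assumption, not the real implementation), `match_any` is a hypothetical helper, and the addresses are made up for illustration.

```python
import re

def analyze(text):
    # Simplified stand-in for the default analyzer (an assumption):
    # lowercase the text, then split on "@" and whitespace.
    return [tok for tok in re.split(r"[@\s]+", text.lower()) if tok]

def match_any(query, field_value):
    # A match query combines its tokens with OR by default,
    # so a single shared token is enough for a hit.
    return bool(set(analyze(query)) & set(analyze(field_value)))

# Hypothetical indexed values, chosen only for illustration.
indexed = ["myemail@gmail.com", "john@gmail.com", "someone@example.org"]

hits = [value for value in indexed if match_any("john@gmail.com", value)]
print(hits)
```

Under these assumptions every gmail.com address is a hit, because each one shares the "gmail.com" token with the query, while the example.org address is not.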

To override this behavior, declare the email field as not_analyzed. Tokenization then does not happen, and the entire string is indexed as a single term.

With "not_analyzed":

"myemail@gmail.com" => [ "myemail@gmail.com" ]

So modify the mapping to this and you should be good:

{
  "users": {
    "mappings": {
      "user": {
        "properties": {
          "email": {
            "type": "string",
            "index": "not_analyzed"
          },
          "name": {
            "type": "string",
            "fields": {
              "raw": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          },
          "nickname": {
            "type": "string"
          }
        }
      }
    }
  }
}
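As far as I know, the index setting of an existing field cannot be changed in place, so the index usually has to be recreated with this mapping and the documents reindexed. Once the field is not_analyzed, an exact lookup can be expressed as a term query. The sketch below only builds the request body and imitates the exact comparison; `term_matches` is a hypothetical helper, and the addresses are assumptions (john@gmail.com is inferred from the tokens shown earlier in this answer).

```python
import json

def term_matches(query, field_value):
    # With not_analyzed, the whole value is stored as one term, so only an
    # identical string matches (no tokenization, no lowercasing).
    return query == field_value

# Hypothetical indexed values, chosen only for illustration.
indexed = ["myemail@gmail.com", "john@gmail.com", "someone@example.org"]
hits = [value for value in indexed if term_matches("john@gmail.com", value)]
print(hits)

# Request body for a term query: the input is not analyzed, and the exact
# term is looked up in the index.
body = {"query": {"term": {"email": "john@gmail.com"}}}
print(json.dumps(body, indent=2))
```

A match query against the not_analyzed field should behave the same way here, since the query string is then not split into tokens either.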

I have described the problem more precisely and another approach to solve it here.

answered Nov 17 '22 00:11

Vineeth Mohan

Tags: node.js, elasticsearch, elasticsearch-plugin