
In ElasticSearch, how do I filter the nested documents in my result?

Suppose, in ElasticSearch 5, I have data with nesting like:

{"number":1234, "names": [ 
  {"firstName": "John", "lastName": "Smith"}, 
  {"firstName": "Al", "lastName": "Jones"}
]},  
...

And I want to query for hits with number 1234 but return only the names that match "lastName": "Jones", so that my result omits names that don't match. In other words, I want to get back only part of the matching document, based on a term query or similar.

A simple nested query won't do, since it only filters the top-level results. Any ideas?

{ "query" : { "bool": { "filter":[
    { "term": { "number":1234} },
    ????  something with "lastName": "Jones" ????
] } } }

I want back:

hits: [
   {"number":1234, "names": [ 
     {"firstName": "Al", "lastName": "Jones"}
   ]},  
   ...
]
Patrick Szalapski asked Aug 08 '17 14:08


1 Answer

The hits section returns _source, which is exactly the document you indexed, so the full names array comes back with every matching hit.

You are right that a nested query filters top-level results, but with inner_hits it also shows which nested objects caused each top-level document to match, and that is exactly what you need. (This assumes names is mapped as type nested.)

The names field can be excluded from the top-level hits using the _source parameter.

{
   "_source": {
      "excludes": ["names"]
   },
   "query":{
      "bool":{
         "must":[
            {
               "term":{
                  "number":{
                     "value":"1234"
                  }
               }
            },
            {
               "nested":{
                  "path":"names",
                  "query":{
                     "term":{
                        "names.lastName":"Jones"
                     }
                  },
                  "inner_hits":{
                  }
               }
            }
         ]
      }
   }
}

Top-level documents are now returned without the names field, and each hit has an additional inner_hits section containing only the names that match.
You should treat nested objects as part of their top-level document. If you really need them to be separate, consider parent/child relations.
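On the client side, the inner_hits section can be folded back into each hit to get the shape asked for in the question. The following is a minimal Python sketch, not part of any Elasticsearch client: merge_inner_hits and sample_response are hypothetical names, the sample is hand-written rather than real cluster output, and it assumes each inner hit's _source holds the nested object itself and that the inner hits are keyed by the nested path ("names").

```python
from copy import deepcopy


def merge_inner_hits(response, path="names"):
    """Rebuild each top-level document with only its matching nested objects.

    `response` is a search response already parsed into a dict.
    `path` is the nested path used in the query (assumed to be the
    inner_hits key, which is the default when no name is given).
    """
    merged = []
    for hit in response["hits"]["hits"]:
        # _source lacks `path` because the query excluded it.
        doc = deepcopy(hit.get("_source", {}))
        inner = hit.get("inner_hits", {}).get(path, {})
        # Assumption: each inner hit's _source is the nested object itself.
        doc[path] = [h["_source"] for h in inner.get("hits", {}).get("hits", [])]
        merged.append(doc)
    return merged


# Hand-written sample shaped like a search response (illustrative only).
sample_response = {
    "hits": {
        "hits": [
            {
                "_source": {"number": 1234},
                "inner_hits": {
                    "names": {
                        "hits": {
                            "total": 1,
                            "hits": [
                                {"_source": {"firstName": "Al", "lastName": "Jones"}}
                            ],
                        }
                    }
                },
            }
        ]
    }
}

result = merge_inner_hits(sample_response)
print(result)
# [{'number': 1234, 'names': [{'firstName': 'Al', 'lastName': 'Jones'}]}]
```

Note that inner_hits returns only the top three matches per document by default; if a document can have more matching names, set "size" inside the inner_hits object of the query.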

Taras Kohut answered Sep 20 '22 05:09