 

Is it possible to sort nested documents in ElasticSearch?

Let's say I have the following mapping:

"site": {
  "properties": {
    "title":       { "type": "string" },
    "description": { "type": "string" },
    "category":    { "type": "string" },
    "tags":        { "type": "array" },
    "point":       { "type": "geo_point" },
    "localities":  {
      "type": "nested",
      "properties": {
        "title":       { "type": "string" },
        "description": { "type": "string" },
        "point":       { "type": "geo_point" }
      }
    }
  }
}

I'm then doing a "_geo_distance" sort on the parent documents, and I am able to sort them on "site.point". However, I would also like the nested localities inside each parent document to be sorted by "_geo_distance".

Is this possible? If so, how?

Yeggeps asked Mar 02 '12 14:03


1 Answer

Unfortunately, no (at least not yet).

A query in ElasticSearch just identifies which documents match the query, and how well they match.

To understand what nested documents are useful for, consider this example:

{
    "title":    "My post",
    "body":     "Text in my body...",
    "followers": [
        {
            "name":     "Joe",
            "status":   "active"
        },
        {
            "name":     "Mary",
            "status":   "pending"
        }
    ]
}

The above JSON, once indexed in ES, is functionally equivalent to the following. Note how the followers field has been flattened:

{
    "title":            "My post",
    "body":             "Text in my body...",
    "followers.name":   ["Joe","Mary"],
    "followers.status": ["active","pending"]
}        

A search for followers with status == active and name == Mary would match this document, incorrectly: the flattened fields no longer record that "active" belongs to Joe while Mary is "pending".
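To make the false match concrete, here is a minimal Python sketch that imitates the flattening in plain application code. It does not talk to ES at all; the helper names flatten and matches_flat are hypothetical, and the matching is simple equality rather than real analyzed-text search.

```python
def flatten(doc):
    """Flatten arrays of objects into multi-valued dotted fields."""
    flat = {}
    for key, value in doc.items():
        if isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            for item in value:
                for sub_key, sub_value in item.items():
                    flat.setdefault(f"{key}.{sub_key}", []).append(sub_value)
        else:
            flat[key] = value
    return flat


def matches_flat(flat, criteria):
    """True if each field holds the wanted value, no matter which object it came from."""
    return all(wanted in flat.get(field, []) for field, wanted in criteria.items())


post = {
    "title": "My post",
    "body": "Text in my body...",
    "followers": [
        {"name": "Joe", "status": "active"},
        {"name": "Mary", "status": "pending"},
    ],
}

flat = flatten(post)
criteria = {"followers.name": "Mary", "followers.status": "active"}

# Flattened view: "Mary" and "active" both appear somewhere, so it matches.
print(matches_flat(flat, criteria))  # True

# Per-object view: no single follower is both Mary and active.
per_object = any(
    f["name"] == "Mary" and f["status"] == "active" for f in post["followers"]
)
print(per_object)  # False
```

The second check is what a nested field effectively restores: each follower is evaluated as a unit.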

Nested fields allow us to work around this limitation. If the followers field is declared to be of type nested instead of type object then its contents are created as a separate (invisible) sub-document internally. That means that we can use a nested query or nested filter to query these nested documents as individual docs.
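As a rough illustration, the mapping and a nested query for the followers example might be built like this in Python. This is a sketch under assumptions: the type name "post" is made up, the lowercase term values assume the default analyzer lowercases tokens at index time, and the exact query syntax should be checked against the ES version in use.

```python
import json

# Hypothetical type name "post"; field names come from the example above.
mapping = {
    "post": {
        "properties": {
            "title": {"type": "string"},
            "body": {"type": "string"},
            "followers": {
                "type": "nested",  # instead of the default object type
                "properties": {
                    "name": {"type": "string"},
                    "status": {"type": "string"},
                },
            },
        }
    }
}

# Both conditions must hold within the same nested follower document.
nested_query = {
    "query": {
        "nested": {
            "path": "followers",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"followers.name": "mary"}},
                        {"term": {"followers.status": "active"}},
                    ]
                }
            },
        }
    }
}

# The serialized bodies are what would be sent to the mapping and search endpoints.
print(json.dumps(mapping, indent=2))
print(json.dumps(nested_query, indent=2))
```

With this mapping, the example post should no longer match, because Mary's status is pending.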

However, the output from the nested query/filter clauses only tells us if the main doc matches, and how well it matches. It doesn't even tell us which of the nested docs matched. To figure that out, we'd have to write code in our application to check each of the nested docs against our search criteria.
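That application-side check could look something like the following Python sketch. The function name matching_nested is hypothetical, and it compares stored values by plain equality, which only approximates how ES would match analyzed text.

```python
def matching_nested(source, path, criteria):
    """Return the sub-documents under `path` that satisfy every criterion."""
    return [
        sub
        for sub in source.get(path, [])
        if all(sub.get(field) == wanted for field, wanted in criteria.items())
    ]


# The _source of a hit returned by a nested query (same example document).
source = {
    "title": "My post",
    "body": "Text in my body...",
    "followers": [
        {"name": "Joe", "status": "active"},
        {"name": "Mary", "status": "pending"},
    ],
}

print(matching_nested(source, "followers", {"status": "active"}))
# [{'name': 'Joe', 'status': 'active'}]
```

This repeats the search logic outside ES, so the two can drift apart if the query changes.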

There are a few open issues requesting the addition of these features, but it is not an easy problem to solve.

The only way to achieve what you want is to index your sub-docs as separate documents, and to query and sort them independently. It may be useful to establish a parent-child relationship between the main doc and these separate sub-docs. (See the parent-type mapping, the Parent & Child section of the index API docs, and the top-children and has-child queries.)
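If the localities are indexed as their own documents, a search body that sorts them by distance might be assembled as in this Python sketch. The site_id field is an assumption (a hypothetical, not-analyzed field on each locality doc holding the id of its site), as is the helper name locality_search_body; the "point" field is the one from the question's mapping.

```python
import json


def locality_search_body(site_id, lat, lon):
    """Build a search body: localities of one site, nearest first."""
    return {
        # Hypothetical site_id field linking a locality back to its site.
        "query": {"term": {"site_id": site_id}},
        "sort": [
            {
                "_geo_distance": {
                    "point": {"lat": lat, "lon": lon},
                    "order": "asc",
                    "unit": "km",
                }
            }
        ],
    }


body = locality_search_body("site-1", 59.33, 18.06)
print(json.dumps(body, indent=2))
```

The application would first run the site search, then issue a request like this for each site (or for a batch of site ids) to get its localities in distance order.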

Also, an ES user has mailed the list about a new has_parent filter that they are currently working on in a fork. However, this is not available in the main ES repo yet.

DrTech answered Oct 05 '22 23:10