
How can I get Elasticsearch aggregations to count the parent documents instead of the nested documents?

My Elasticsearch index uses nested documents to record the places where events related to each document occurred. I am using aggregations to build facets of the places, but the count returned is the number of occurrences of each place. For example, if a document has both a birth place and a death place of California, the aggregation count for California is 2. I would like the count to be the number of documents containing a particular place, rather than the number of nested documents containing it. The relevant part of my schema looks like this:

"mappings": {
    "document": {
        "properties": {
            "docId" : { "type": "keyword" },
            "place": {
                "type": "nested",
                "properties": {
                    "id": { "type": "keyword" },
                    "type": { "type": "keyword" },
                    "loc": { "type" : "geo_point" },
                    "text": { 
                        "type": "text",
                        "analyzer": "english",
                        "copy_to" : "text"
                    }
                },
                "dynamic": false
            }
        }
    }
}

I can get facets with a simple aggregation like the one below. It retrieves the places whose type matches place.vital.* (e.g. place.vital.birth, place.vital.death), but it counts the nested documents, not the parent documents.

"aggs": {
    "place.vital": {
        "aggs": {
            "types": {
                "aggs": {
                    "values": {
                        "terms": {
                            "field": "place.id"
                        }
                    }
                },
                "terms": {
                    "field": "place.type",
                    "include": "place\\.vital\\..*"
                }
            }
        },
        "nested": {
            "path": "place"
        }
    }
}

Is it possible to tweak my aggregation so that it only counts each parent document once?

Robert Wille asked Sep 20 '25 19:09



1 Answer

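Use a reverse_nested aggregation. Added as a sub-aggregation inside the nested aggregation, it joins each bucket back to its parent documents, so the result contains both the nested counts (each bucket's doc_count) and, in the sub-aggregation, the parent counts.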

See "how to return the count of unique documents by using elasticsearch aggregation" for more detail.
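
As a rough illustration of the suggestion above, the sketch below rebuilds the aggregation from the question as a Python dictionary and adds a reverse_nested sub-aggregation under the place.id terms. The sub-aggregation name parent_docs and the helper functions build_place_vital_aggs and parent_counts are made up for this example and do not come from the original post. The parsing helper assumes the usual response layout, in which each terms bucket carries its sub-aggregations by name.

```python
import json

# Hypothetical sub-aggregation name; it is not from the original post.
PARENT_AGG = "parent_docs"


def build_place_vital_aggs():
    """Return the question's aggregation with a reverse_nested sub-aggregation."""
    return {
        "place.vital": {
            "nested": {"path": "place"},
            "aggs": {
                "types": {
                    "terms": {
                        "field": "place.type",
                        "include": "place\\.vital\\..*",
                    },
                    "aggs": {
                        "values": {
                            "terms": {"field": "place.id"},
                            # Join back to the root documents: the doc_count of
                            # this sub-aggregation counts parent documents.
                            "aggs": {PARENT_AGG: {"reverse_nested": {}}},
                        }
                    },
                }
            },
        }
    }


def parent_counts(aggregations):
    """Map (place type, place id) to the parent-document count in a response."""
    counts = {}
    for type_bucket in aggregations["place.vital"]["types"]["buckets"]:
        for value_bucket in type_bucket["values"]["buckets"]:
            key = (type_bucket["key"], value_bucket["key"])
            counts[key] = value_bucket[PARENT_AGG]["doc_count"]
    return counts


# Request body for the _search endpoint; "size": 0 skips the hits themselves.
request_body = {"size": 0, "aggs": build_place_vital_aggs()}
request_json = json.dumps(request_body, indent=2)
```

Sending request_json as the search body and passing the response's "aggregations" object to parent_counts should give one parent count per (type, place) pair. The buckets are still split by place.type, so a document with both a birth and a death in California would contribute 1 to the California bucket under each type.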

richardwhatever answered Sep 23 '25 05:09
