Suppose I have an index with nested documents that look like this:
{
    "id" : 1234,
    "cars" : [{
            "id" : 987,
            "name" : "Volkswagen"
        }, {
            "id" : 988,
            "name" : "Tesla"
        }
    ]
}
I now want to get a count aggregation of "car" documents that match certain criteria, e.g. that match a search query. My initial attempt was the following query:
{
  "query" : {
    "nested" : {
      "path" : "cars",
      "query" : {
        "query_string" : {
          "fields" : ["cars.name"],
          "query" : "Tes*"
        }
      }
    }
  },
  "aggregations" : {
    "cars" :{
      "nested" : {
        "path" : "cars"
      },
      "aggs" : {
        "cars" : {
          "terms" : {
            "field" : "cars.id"
          }
        }
      }
    }
  }
}
I was hoping to get an aggregation result containing only the ids of cars whose names begin with "Tes". However, the aggregation instead covers all cars belonging to any top-level document that contains at least one matching nested document. That is, in the above example "Volkswagen" would also be counted, because its top-level document also contains a car that does match.
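To make the behaviour concrete, here is a rough local simulation in Python that does not talk to Elasticsearch at all. It is only a sketch: the second document (id 1235) is made up for illustration, and the `car_matches` helper uses a case-sensitive glob as a stand-in for the `query_string` wildcard, which is a simplification of how analyzed fields are really matched.

```python
from collections import Counter
from fnmatch import fnmatchcase

# Sample top-level documents. The first one is from the question;
# the second one (id 1235) is hypothetical, added only for contrast.
docs = [
    {"id": 1234, "cars": [{"id": 987, "name": "Volkswagen"},
                          {"id": 988, "name": "Tesla"}]},
    {"id": 1235, "cars": [{"id": 989, "name": "Volvo"}]},
]


def car_matches(car, pattern="Tes*"):
    """Simplified stand-in for the query_string wildcard match on cars.name."""
    return fnmatchcase(car["name"], pattern)


# The nested query selects top-level documents that have at least one matching car.
hits = [d for d in docs if any(car_matches(c) for c in d["cars"])]

# A nested + terms aggregation then buckets every car of those documents,
# not just the cars that matched.
unfiltered = Counter(c["id"] for d in hits for c in d["cars"])

print(dict(unfiltered))
```

Car 987 ("Volkswagen") ends up in the buckets even though it does not match, because its parent document was selected by the query.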
How can I get an aggregation of just the matching nested documents?
In the meantime I've figured it out: to achieve this, a filter aggregation should be added around the terms aggregation, like so:
  "aggregations" : {
    "cars" :{
      "nested" : {
        "path" : "cars"
      },
      "aggs" : {
        "cars-filter" : {
          "filter" : {
            "query" : {
              "query_string" : {
                "fields" : ["cars.name"],
                "query" : "Tes*"
              }
            }  
          },
          "aggs" : {
            "cars" : {
              "terms" : {
                "field" : "cars.id"
              }
            }
          }
        }
      }
    }
  }
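For anyone assembling the request in code, here is a minimal Python sketch that builds the full body as a plain dict. Note one assumption: as far as I know, on Elasticsearch 2.0 and later the `query` wrapper inside `filter` is no longer needed (and I believe it was removed in 5.0), so this sketch puts the `query_string` directly under `filter`. The `build_request` function name is hypothetical, and no client library is used here; check the syntax against the version you are running.

```python
import json


def build_request(pattern="Tes*"):
    """Build a search body whose terms aggregation only sees matching cars.

    Hypothetical helper; assumes the newer filter syntax without the
    "query" wrapper used in the snippet above.
    """
    name_query = {
        "query_string": {"fields": ["cars.name"], "query": pattern}
    }
    return {
        # Selects top-level documents with at least one matching car.
        "query": {"nested": {"path": "cars", "query": name_query}},
        "aggregations": {
            "cars": {
                "nested": {"path": "cars"},
                "aggs": {
                    # Re-applies the same condition at the nested level,
                    # so non-matching sibling cars are dropped.
                    "cars-filter": {
                        "filter": name_query,
                        "aggs": {
                            "cars": {"terms": {"field": "cars.id"}}
                        },
                    }
                },
            }
        },
    }


body = build_request()
print(json.dumps(body, indent=2))
```

Reusing the same `name_query` dict in both places keeps the top-level query and the aggregation filter in sync, which is the point of the fix.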