Suppose I have an index with nested documents that look like this:
{
  "id" : 1234,
  "cars" : [
    {
      "id" : 987,
      "name" : "Volkswagen"
    },
    {
      "id" : 988,
      "name" : "Tesla"
    }
  ]
}
I now want to get a count aggregation of the "car" documents that match certain criteria, e.g. that match a search query. My initial attempt was the following query:
{
  "query" : {
    "nested" : {
      "path" : "cars",
      "query" : {
        "query_string" : {
          "fields" : ["cars.name"],
          "query" : "Tes*"
        }
      }
    }
  },
  "aggregations" : {
    "cars" : {
      "nested" : {
        "path" : "cars"
      },
      "aggs" : {
        "cars" : {
          "terms" : {
            "field" : "cars.id"
          }
        }
      }
    }
  }
}
I was hoping to get an aggregation result containing only the ids of cars whose name begins with "Tes". Instead, the aggregation covers every car in any top-level document that contains at least one matching nested document. That is, in the above example "Volkswagen" is also counted, because its top-level document also contains a car that does match.
How can I get an aggregation of just the matching nested documents?
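To make the behaviour described above concrete, here is a small in-memory sketch in Python. It does not call Elasticsearch; it only mimics the semantics on the sample document. The helper `matches` and the use of `fnmatch` as a stand-in for the `query_string` wildcard are assumptions made for illustration (real matching also depends on the field's analyzer).

```python
from collections import Counter
import fnmatch

# Sample data mirroring the document from the question.
docs = [
    {"id": 1234, "cars": [
        {"id": 987, "name": "Volkswagen"},
        {"id": 988, "name": "Tesla"},
    ]},
]

def matches(car, pattern="Tes*"):
    # Hypothetical, case-sensitive stand-in for the wildcard match on cars.name.
    return fnmatch.fnmatchcase(car["name"], pattern)

# The top-level nested query selects whole parent documents.
hits = [d for d in docs if any(matches(c) for c in d["cars"])]

# A plain nested + terms aggregation then sees every car of those parents.
unfiltered = Counter(c["id"] for d in hits for c in d["cars"])

# Restricting the aggregation to matching cars gives the desired counts.
filtered = Counter(c["id"] for d in hits for c in d["cars"] if matches(c))

print(dict(unfiltered))  # {987: 1, 988: 1}
print(dict(filtered))    # {988: 1}
```

The first result includes car 987 ("Volkswagen") only because it shares a parent with a matching car; the second is what the question is asking for.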
In the meantime I've figured it out: to achieve this, a filter aggregation should be wrapped around the terms aggregation, like so:
"aggregations" : {
  "cars" : {
    "nested" : {
      "path" : "cars"
    },
    "aggs" : {
      "cars-filter" : {
        "filter" : {
          "query" : {
            "query_string" : {
              "fields" : ["cars.name"],
              "query" : "Tes*"
            }
          }
        },
        "aggs" : {
          "cars" : {
            "terms" : {
              "field" : "cars.id"
            }
          }
        }
      }
    }
  }
}
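For completeness, a minimal sketch of how the full request body could be assembled in Python, combining the nested query from the question with the filtered aggregation above. The variable names `name_query` and `body` are made up for illustration, and sending the request (client, index name) is left out on purpose.

```python
import json

# The same criteria drive both the hit query and the aggregation filter.
name_query = {"query_string": {"fields": ["cars.name"], "query": "Tes*"}}

body = {
    "query": {"nested": {"path": "cars", "query": name_query}},
    "aggregations": {
        "cars": {
            "nested": {"path": "cars"},
            "aggs": {
                "cars-filter": {
                    # The "query" wrapper matches the answer above; newer
                    # Elasticsearch releases, as far as I know, expect the
                    # query directly under "filter" instead.
                    "filter": {"query": name_query},
                    "aggs": {"cars": {"terms": {"field": "cars.id"}}},
                }
            },
        }
    },
}

print(json.dumps(body, indent=2))
```

Defining the criteria once and reusing them keeps the query and the aggregation filter from drifting apart. In the response, the buckets should then sit under `aggregations` > `cars` > `cars-filter` > `cars` > `buckets`.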