My documents are structured in the following way:
{
  "chefInfo": {
    "id": int,
    "employed": String,
    ... Some more chef information ...
  },
  "recipe": {
    ... Some recipe information ...
  }
}
If a chef has multiple recipes, the nested chefInfo block will be identical in each document. My problem is that I want to aggregate on a field in the chefInfo part of the document, but the aggregation doesn't take into account the fact that the chefInfo block is duplicated.
So, if the chef with the id of 1 is on 5 recipes and I aggregate on the employed field, this particular chef accounts for 5 of the counts in the aggregation, whereas I want each chef to be counted only once.
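To make the difference concrete, here is a small in-memory sketch in Python (not Elasticsearch itself). The sample documents and the "yes"/"no" values of employed are made up for illustration; only the field names come from the structure above.

```python
from collections import Counter

# Hypothetical sample data: chef 1 appears on 5 recipe documents, chef 2 on 1.
docs = [
    {"chefInfo": {"id": 1, "employed": "yes"}, "recipe": {"name": f"recipe-{i}"}}
    for i in range(5)
]
docs.append({"chefInfo": {"id": 2, "employed": "no"}, "recipe": {"name": "recipe-5"}})

# Counting per document: every recipe document contributes one count,
# so a chef with several recipes is counted several times.
per_document = Counter(d["chefInfo"]["employed"] for d in docs)

# Counting per chef: collect the distinct chef ids seen for each employed value.
chef_ids_by_value = {}
for d in docs:
    chef = d["chefInfo"]
    chef_ids_by_value.setdefault(chef["employed"], set()).add(chef["id"])
per_chef = {value: len(ids) for value, ids in chef_ids_by_value.items()}

print(dict(per_document))  # counts per document
print(per_chef)            # counts per distinct chef
```

The second dictionary is the result I am after: one count per distinct chef for each value of employed.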
I thought about doing a top_hits aggregation on the chef_id and then running a sub-aggregation over all of the buckets, but I can't work out how to do the counts over the results of all the buckets.
Is what I want to do possible?
A top_hits metric aggregator keeps track of the most relevant document being aggregated. This aggregator is intended to be used as a sub aggregator, so that the top matching documents can be aggregated per bucket.
Elasticsearch Aggregations provide you with the ability to group and perform calculations and statistics (such as sums and averages) on your data by using a simple search query. An aggregation can be viewed as a working unit that builds analytical information across a set of documents.
A search consists of one or more queries that are combined and sent to Elasticsearch. Documents that match a search's queries are returned in the hits, or search results, of the response.
sum_other_doc_count is the number of documents that didn't make it into the top size terms.
Elasticsearch treats every document as unique in its own right. In your case you want to define uniqueness by a different field, here chefInfo.id. To count the unique values of that field, use the cardinality aggregation.
You can apply the aggregation as below:
{
  "aggs": {
    "employed": {
      "nested": {
        "path": "chefInfo"
      },
      "aggs": {
        "employed": {
          "terms": {
            "field": "chefInfo.employed.keyword"
          },
          "aggs": {
            "employed_unique": {
              "cardinality": {
                "field": "chefInfo.id"
              }
            }
          }
        }
      }
    }
  }
}
In the result, employed_unique gives you the expected count for each employed bucket.
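As a rough sketch of how this could be used from client code, the Python below builds the same request body as a dict and reads the per-bucket unique counts out of a response. The function names, the nested flag, the added "size": 0 (to skip returning hits) and the sample response are all assumptions for illustration, and no client library is called. Two caveats: the nested wrapper only applies if chefInfo is mapped with type nested, so for a plain object mapping the terms aggregation would go at the top level; and cardinality returns an approximate count, which can matter for very large numbers of distinct ids.

```python
def build_query(nested=True):
    """Build the request body; `nested` is a hypothetical switch for the mapping type."""
    terms_agg = {
        "terms": {"field": "chefInfo.employed.keyword"},
        "aggs": {"employed_unique": {"cardinality": {"field": "chefInfo.id"}}},
    }
    if nested:
        # chefInfo mapped as type "nested": wrap the terms aggregation.
        aggs = {"employed": {"nested": {"path": "chefInfo"}, "aggs": {"employed": terms_agg}}}
    else:
        # chefInfo mapped as a plain object: no nested wrapper.
        aggs = {"employed": terms_agg}
    return {"size": 0, "aggs": aggs}


def unique_chefs_per_value(response, nested=True):
    """Map each employed value to its distinct-chef count from a search response dict."""
    node = response["aggregations"]["employed"]
    if nested:
        node = node["employed"]  # step into the terms aggregation under the nested one
    return {bucket["key"]: bucket["employed_unique"]["value"] for bucket in node["buckets"]}


# Made-up response in the shape returned for the nested query above.
sample_response = {
    "aggregations": {
        "employed": {
            "doc_count": 6,
            "employed": {
                "buckets": [
                    {"key": "yes", "doc_count": 5, "employed_unique": {"value": 1}},
                    {"key": "no", "doc_count": 1, "employed_unique": {"value": 1}},
                ]
            },
        }
    }
}

print(unique_chefs_per_value(sample_response))
```

In each bucket, doc_count is still the per-document count (5 for the chef with five recipes), while employed_unique.value is the per-chef count.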