Say, I have following documents:
1st doc:
{
productName: "product1",
tags: [
{
"name":"key1",
"value":"value1"
},
{
"name":"key2",
"value":"value2"
}
]
}
2nd doc:
{
productName: "product2",
tags: [
{
"name":"key1",
"value":"value1"
},
{
"name":"key2",
"value":"value3"
}
]
}
I know if I want to group by productName, I could use a terms
aggregation
"terms": {
"field": "productName"
}
which will give me two buckets with two different keys "product1", "product2".
However, what should the query be if I would like to group by tag key? i.e. I would like to group by tag with name==key1, then I am expecting one bucket with key="value1"; while if I group by tag with name==key2, I am expecting the result to be two buckets with keys "value2", "value3".
What should the query look like if I would like to group by the 'value' inside a nested array but not group by the 'key'? Any suggestion?
It sounds like a nested terms aggregation is what you're looking for.
With the two documents you posted, this query:
POST /test_index/_search
{
"size": 0,
"aggs": {
"product_name_terms": {
"terms": {
"field": "product_name"
}
},
"nested_tags": {
"nested": {
"path": "tags"
},
"aggs": {
"tags_name_terms": {
"terms": {
"field": "tags.name"
}
},
"tags_value_terms": {
"terms": {
"field": "tags.value"
}
}
}
}
}
}
returns this:
{
"took": 67,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": []
},
"aggregations": {
"product_name_terms": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": []
},
"nested_tags": {
"doc_count": 4,
"tags_name_terms": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "key1",
"doc_count": 2
},
{
"key": "key2",
"doc_count": 2
}
]
},
"tags_value_terms": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "value1",
"doc_count": 2
},
{
"key": "value2",
"doc_count": 1
},
{
"key": "value3",
"doc_count": 1
}
]
}
}
}
}
Here is some code I used to test it:
http://sense.qbox.io/gist/a9a172f41dbd520d5e61063a9686055681110522
EDIT: Filter by Nested Value
As per your comment, if you want to filter the nested results by a value (of the nested results), you can add another "layer" of aggregation making use of the filter aggregation as follows:
POST /test_index/_search
{
"size": 0,
"aggs": {
"nested_tags": {
"nested": {
"path": "tags"
},
"aggs": {
"filter_tag_name": {
"filter": {
"term": {
"tags.name": "key1"
}
},
"aggs": {
"tags_name_terms": {
"terms": {
"field": "tags.name"
}
},
"tags_value_terms": {
"terms": {
"field": "tags.value"
}
}
}
}
}
}
}
}
which returns:
{
"took": 10,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": []
},
"aggregations": {
"nested_tags": {
"doc_count": 4,
"filter_tag_name": {
"doc_count": 2,
"tags_name_terms": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "key1",
"doc_count": 2
}
]
},
"tags_value_terms": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "value1",
"doc_count": 2
}
]
}
}
}
}
}
Here's the updated code:
http://sense.qbox.io/gist/507c3aabf36b8f6ed8bb076c8c1b8552097c5458
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With