I use the following simple query to search across documents in my Elasticsearch index:
{
  "query": { "query_string": { "query": "*test*" } },
  "aggregations": {
    "myaggregation": {
      "terms": { "field": "myField.raw", "size": 0 }
    }
  }
}
This returns the number of documents for each distinct value of myField.raw.
Since I'm interested in the actual documents rather than just the counts, I tried adding the following top_hits sub-aggregation:
{
  "query": { "query_string": { "query": "*test*" } },
  "aggregations": {
    "myaggregation": {
      "terms": { "field": "myField.raw", "size": 0 },
      "aggregations": {
        "hits": {
          "top_hits": { "size": 2000000 }
        }
      }
    }
  }
}
This ugly usage of top_hits works, but it is extremely slow.
Is there a proper way to fetch the actual documents for each bucket after running the terms aggregation?
Have you considered collapsing the results on the field?
When inner_hits is set on the collapse, the documents of each group are returned under inner_hits (hits.hits[].inner_hits.<collapse-group-name>.hits.hits[]._source).
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-request-collapse.html
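To make the suggestion concrete, here is a minimal sketch in Python of what a collapse request for the question's query might look like, and how the grouped documents could be read back from the response path mentioned above. The helper names (build_collapse_query, group_sources), the group name "by_value", and the default sizes are assumptions made for illustration; they are not part of the original question or of any client library. Note that collapse returns only the top documents of each group, not necessarily all of them: as far as I know, the inner_hits size is bounded by the index.max_inner_result_window setting, so this may not fully replace the top_hits approach when groups are very large.

```python
import json


def build_collapse_query(query_text, field, group_name="by_value",
                         max_groups=10, per_group=100):
    """Build a search body that collapses hits on `field`.

    group_name, max_groups and per_group are illustrative defaults,
    not values taken from the original question.
    """
    return {
        "query": {"query_string": {"query": query_text}},
        # Top-level size controls how many collapsed groups come back.
        "size": max_groups,
        "collapse": {
            "field": field,
            # inner_hits expands each group into its member documents.
            "inner_hits": {"name": group_name, "size": per_group},
        },
    }


def group_sources(response, field, group_name="by_value"):
    """Map each collapsed value to the _source documents in its inner_hits."""
    groups = {}
    for hit in response["hits"]["hits"]:
        # Each collapsed hit carries the collapse value under "fields".
        key = hit["fields"][field][0]
        inner = hit["inner_hits"][group_name]["hits"]["hits"]
        groups[key] = [doc["_source"] for doc in inner]
    return groups


if __name__ == "__main__":
    # Print the request body so it can be pasted into a search request.
    print(json.dumps(build_collapse_query("*test*", "myField.raw"), indent=2))
```

The body produced by build_collapse_query can be sent to the _search endpoint with whatever client you already use; group_sources then only needs the parsed JSON response as a dict.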