I use the following simple query to search across documents in my Elasticsearch index:
{
  "query": { "query_string": { "query": "*test*" } },
  "aggregations": {
    "myaggregation": {
      "terms": { "field": "myField.raw", "size": 0 }
    }
  }
}
This returns the number of documents for each distinct value of myField.raw.
Since I'm interested in the actual documents rather than just the counts, I tried adding the following top_hits sub-aggregation:
{
  "query": { "query_string": { "query": "*test*" } },
  "aggregations": {
    "myaggregation": {
      "terms": { "field": "myField.raw", "size": 0 },
      "aggregations": {
        "hits": {
          "top_hits": { "size": 2000000 }
        }
      }
    }
  }
}
This ugly use of top_hits works, but it is slow as hell.
Is there a proper way to fetch the actual documents for each bucket after doing the terms aggregation?
Have you considered using collapse on the field?
When inner_hits is requested inside collapse, it returns the documents grouped under inner_hits (hits.hits[].inner_hits.<collapse-group-name>.hits.hits[]._source).
See: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-request-collapse.html
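A minimal sketch of what such a request might look like, reusing the query and field from the question. The inner_hits name docs_per_value and both size values are illustrative assumptions, not something from the original post; adjust them to your data.

```json
{
  "query": { "query_string": { "query": "*test*" } },
  "collapse": {
    "field": "myField.raw",
    "inner_hits": {
      "name": "docs_per_value",
      "size": 100
    }
  },
  "size": 100
}
```

With this shape, each entry in hits.hits[] stands for one distinct value of myField.raw, and its documents appear under inner_hits.docs_per_value.hits.hits[]. The top-level size limits how many groups come back, while the inner_hits size limits the documents per group; the latter is capped by the index setting index.max_inner_result_window (100 by default), so very large groups may still need a follow-up query per value. The collapse field also has to be a single-valued keyword or numeric field with doc_values enabled.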