
How to return all documents for each bucket after ElasticSearch term aggregation?

I use the following simple query to search across the documents in my Elasticsearch index:

{
    "query": { "query_string": { "query": "*test*" } },
    "aggregations": {
        "myaggregation": {
            "terms": { "field": "myField.raw", "size": 0 }
        }
    }
}

This returns the number of documents per distinct value of myField.raw.
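For reference, the counts come back under aggregations.myaggregation.buckets, one entry per distinct value. A minimal sketch of reading them, assuming the JSON response has already been parsed into a Python dict; the helper name bucket_counts and the sample response are made up for illustration and are not real output from my index:

```python
from typing import Any, Dict


def bucket_counts(response: Dict[str, Any], agg_name: str = "myaggregation") -> Dict[Any, int]:
    """Map each bucket key (a distinct field value) to its document count."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hand-written sample in the shape of a terms aggregation response (hypothetical data).
sample_response = {
    "aggregations": {
        "myaggregation": {
            "buckets": [
                {"key": "alpha", "doc_count": 3},
                {"key": "beta", "doc_count": 1},
            ]
        }
    }
}

counts = bucket_counts(sample_response)
```

Each bucket carries only the key and doc_count, which is why the documents themselves have to be requested separately.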

Since I'm interested in the actual documents rather than just the total number, I tried adding the following top_hits sub-aggregation:

{
    "query": { "query_string": { "query": "*test*" } },
    "aggregations": {
        "myaggregation": {
            "terms": { "field": "myField.raw", "size": 0 },
            "aggregations": {
                "hits": {
                    "top_hits": { "size": 2000000 }
                }
            }
        }
    }
}

This ugly usage of top_hits works, but is slow as hell.

Is there a proper way to fetch the actual documents for each bucket after doing the terms aggregation?

manu asked Jun 24 '15 14:06


1 Answer

Have you considered field collapsing, i.e. the collapse parameter of the search request?

It collapses the hits on the given field, and with inner_hits it returns the documents of each group under hits.hits[].inner_hits.<collapse-group-name>.hits.hits[]._source

See https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-request-collapse.html
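A rough sketch of what that could look like, written as plain Python dict handling so it does not depend on any client library. The function names, the group name "by_myfield", the sizes and the sample response are all made up for illustration; only the collapse / inner_hits request keys and the response path come from the linked documentation. Note that the inner_hits size is capped by the index setting index.max_inner_result_window (100 by default), so this is not a drop-in way to pull millions of documents per group.

```python
from typing import Any, Dict, List


def build_collapse_query(text: str, field: str, group_name: str,
                         per_group: int, groups: int) -> Dict[str, Any]:
    """Build a search body that collapses hits on `field` and expands each group."""
    return {
        "query": {"query_string": {"query": text}},
        "size": groups,  # number of collapsed groups returned per page
        "collapse": {
            "field": field,
            # per_group is limited by index.max_inner_result_window (default 100)
            "inner_hits": {"name": group_name, "size": per_group},
        },
    }


def group_documents(response: Dict[str, Any], field: str,
                    group_name: str) -> Dict[Any, List[Dict[str, Any]]]:
    """Collect the _source of every inner hit, keyed by the collapsed field value."""
    grouped: Dict[Any, List[Dict[str, Any]]] = {}
    for hit in response.get("hits", {}).get("hits", []):
        # Collapsed hits report the group value under "fields" as a list.
        key = hit.get("fields", {}).get(field, [None])[0]
        inner = (hit.get("inner_hits", {}).get(group_name, {})
                    .get("hits", {}).get("hits", []))
        grouped[key] = [doc["_source"] for doc in inner]
    return grouped


body = build_collapse_query("*test*", "myField.raw", "by_myfield", per_group=10, groups=50)

# Hand-written sample in the shape of a collapsed response (hypothetical data).
sample_response = {
    "hits": {"hits": [
        {"fields": {"myField.raw": ["alpha"]},
         "inner_hits": {"by_myfield": {"hits": {"hits": [
             {"_source": {"title": "test one"}},
             {"_source": {"title": "test two"}},
         ]}}}},
        {"fields": {"myField.raw": ["beta"]},
         "inner_hits": {"by_myfield": {"hits": {"hits": [
             {"_source": {"title": "another test"}},
         ]}}}},
    ]}
}

grouped = group_documents(sample_response, "myField.raw", "by_myfield")
```

The body dict would be sent as the JSON payload of the _search request; the groups themselves are paged with the usual top-level from/size.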

JBourne answered Oct 10 '22 17:10