
Getting _id fields of aggregated records in Elasticsearch

I am using ES to aggregate results based on a field. In addition to that, I would like to retrieve the _id of the records that went into each aggregated bucket. Is that possible?

For example, for the following query

{
    "aggs" : {
        "genders" : {
            "terms" : { "field" : "gender" }
        }
    }
}

the response would be something like this

{
    ...

    "aggregations" : {
        "genders" : {
            "doc_count_error_upper_bound": 0, 
            "sum_other_doc_count": 0, 
            "buckets" : [ 
                {
                    "key" : "male",
                    "doc_count" : 14
                },
                {
                    "key" : "female",
                    "doc_count" : 14
                }
            ]
        }
    }
}

Now, I also want the _id of each of the 14 male and 14 female records that make up the aggregation.

Why would I need that?

Say, because I need to do some post-processing on these records, i.e. insert a new field into those records based on their gender. Of course, it's not as trivial as that, but my use case is something along those lines.

Thanks in advance!

OneMoreError asked Jan 05 '23 19:01



1 Answer

Create a nested aggregation, with the sub-aggregation placed inside the "genders" aggregation, something like

{
    "aggs" : {
        "genders" : {
            "terms" : { "field" : "gender" },
            "aggs" : {
                "ids" : {
                    "terms" : { "field" : "_uid" }
                }
            }
        }
    }
}
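
As a rough illustration of how the nested aggregation could be used for the post-processing step in the question, here is a minimal Python sketch. It only builds the request body and unpacks a response; the function names, the `max_ids` parameter, and the sample response are hypothetical and made up for illustration. It assumes the sub-aggregation sits inside "genders", that "_uid" values look like "type#id", and that the inner terms "size" is large enough to cover every record in a bucket. Note that "_uid" may not be available in newer Elasticsearch versions, in which case another identifier field would have to be used instead.

```python
def build_query(bucket_field="gender", id_field="_uid", max_ids=1000):
    """Build a terms aggregation with a terms sub-aggregation on the id field.

    max_ids is a hypothetical knob: the inner terms aggregation only returns
    its top `size` keys, so it must be at least the largest bucket size.
    """
    return {
        "size": 0,  # only the aggregations are needed, not the search hits
        "aggs": {
            "genders": {
                "terms": {"field": bucket_field},
                "aggs": {
                    "ids": {"terms": {"field": id_field, "size": max_ids}}
                },
            }
        },
    }


def ids_per_bucket(response):
    """Map each outer bucket key to the keys of its 'ids' sub-buckets."""
    buckets = response["aggregations"]["genders"]["buckets"]
    return {b["key"]: [sub["key"] for sub in b["ids"]["buckets"]] for b in buckets}


# Hypothetical response fragment, invented for this example (not real output).
sample_response = {
    "aggregations": {
        "genders": {
            "buckets": [
                {
                    "key": "male",
                    "doc_count": 2,
                    "ids": {
                        "buckets": [
                            {"key": "person#1", "doc_count": 1},
                            {"key": "person#4", "doc_count": 1},
                        ]
                    },
                },
                {
                    "key": "female",
                    "doc_count": 1,
                    "ids": {"buckets": [{"key": "person#2", "doc_count": 1}]},
                },
            ]
        }
    }
}

print(ids_per_bucket(sample_response))
# prints {'male': ['person#1', 'person#4'], 'female': ['person#2']}
```

The dictionary returned by `build_query()` would be sent as the search request body, and the resulting id lists could then drive the per-gender updates.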
Deadlock answered Mar 23 '23 10:03
