Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ElasticSearch: aggregation on _score field?

I would like to use the stats or extended_stats aggregation on the _score field but can't find any examples of this being done (i.e., seems like you can only use aggregations with actual document fields).

Is it possible to request aggregations on calculated "metadata" fields for each hit in an ElasticSearch query response (e.g., _score, _type, _shard, etc.)?

I'm assuming the answer is 'no' since fields like _score aren't indexed...

like image 664
Clint Harris Avatar asked Jul 03 '14 15:07

Clint Harris


People also ask

How do you aggregate text fields in Elasticsearch?

Aggregations on text fields By default, Elasticsearch doesn't support aggregations on a text field. Because text fields are tokenized, an aggregation on a text field has to reverse the tokenization process back to its original string and then formulate an aggregation based on that.

Can Kibana perform aggregation across fields that contain nested objects?

But visualizations in Kibana don't aggregate on nested fields like that, regardless of how you set your mappings -- if you want to run aggregations on the data in the items list, you aren't going to get the results you are looking for. Then doing the same sum aggregation should return the expected results.

Is Elasticsearch good for aggregation?

Elasticsearch Aggregations provide you with the ability to group and perform calculations and statistics (such as sums and averages) on your data by using a simple search query. An aggregation can be viewed as a working unit that builds analytical information across a set of documents.


1 Answers

Note: The original answer is now outdated in terms of the latest version of Elasticsearch. The equivalent script using Groovy scripting would be:

{
    ...,
    "aggregations" : {
        "grades_stats" : { 
            "stats" : { 
                "script" : "_score" 
            } 
        }
    }
}

In order to make this work, you will need to enable dynamic scripting or, even better, store a file-based script and execute it by name (for added security by not enabling dynamic scripting)!


You can use a script and refer to the score using doc.score. More details are available in ElasticSearch's scripting documentation.

A sample stats aggregation could be:

{
    ...,
    "aggregations" : {
        "grades_stats" : { 
            "stats" : { 
                "script" : "doc.score" 
            } 
        }
    }
}

And the results would look like:

"aggregations": {
    "grades_stats": {
        "count": 165,
        "min": 0.46667441725730896,
        "max": 3.1525731086730957,
        "avg": 0.8296855776598959,
        "sum": 136.89812031388283
    }
}

A histogram may also be a useful aggregation:

"aggs": {
    "grades_histogram": {
        "histogram": {
            "script": "doc.score * 10",
            "interval": 3
        }
    }
}

Histogram results:

"aggregations": {
    "grades_histogram": {
        "buckets": [
            {
               "key": 3,
               "doc_count": 15
            },
            {
               "key": 6,
               "doc_count": 103
            },
            {
               "key": 9,
               "doc_count": 46
            },
            {
               "key": 30,
               "doc_count": 1
            }
        ]
    }
}
like image 140
mas2df Avatar answered Sep 20 '22 06:09

mas2df