
elasticsearch - Aggregation returns single terms in key, not the complete field; how can I get the full field returned?

In my Elasticsearch implementation, I have a few simple aggregations on a few fields, as shown below:

 "aggs" : {     "author" : {         "terms" : { "field" : "author"            , "size": 20,           "order" : { "_term" : "asc" }         }     },     "title" : {         "terms" : { "field" : "title"            , "size": 20         }     },     "contentType" : {         "terms" : { "field" : "docType"            , "size": 20         }     } } 

The aggregations work and return results, but for the title field (or any other multi-word field) each bucket key is a single word rather than the full field value. I need the full title in the returned result rather than just one word, which doesn't make much sense. How can I get that?

Current results (just a snippet):

"title": {      "buckets": [         {            "key": "test",            "doc_count": 1716         },         {            "key": "pptx",            "doc_count": 1247         },         {            "key": "and",            "doc_count": 661         },         {            "key": "for",            "doc_count": 489         },         {            "key": "mobile",            "doc_count": 487         },         {            "key": "docx",            "doc_count": 486         },         {            "key": "pdf",            "doc_count": 450         },         {            "key": "2012",            "doc_count": 397         } ] } 

Expected results:

"title": {          "buckets": [             {                "key": "test document for stack overflow ",                "doc_count": 1716             },             {                "key": "this is a pptx",                "doc_count": 1247             },             {                "key": "its another document and so on",                "doc_count": 661             },             {                "key": "for",                "doc_count": 489             },             {                "key": "mobile",                "doc_count": 487             },             {                "key": "docx",                "doc_count": 486             },             {                "key": "pdf",                "doc_count": 450             },             {                "key": "2012",                "doc_count": 397             } } 

I went through a lot of documentation. It explains different ways to aggregate results, but I couldn't find how to get the full text of a field as the key in the result. Please advise how I can achieve this.

dev123 asked Jul 08 '14 19:07





1 Answer

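The title field is analyzed, so its value is split into individual tokens at index time, and the terms aggregation buckets on those tokens. You need untokenized copies of the values in the index. In your mapping, use multi-fields to add a not_analyzed sub-field to each field: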

    {
        "test": {
            "mappings": {
                "book": {
                    "properties": {
                        "author": {
                            "type": "string",
                            "fields": {
                                "untouched": {
                                    "type": "string",
                                    "index": "not_analyzed"
                                }
                            }
                        },
                        "title": {
                            "type": "string",
                            "fields": {
                                "untouched": {
                                    "type": "string",
                                    "index": "not_analyzed"
                                }
                            }
                        },
                        "docType": {
                            "type": "string",
                            "fields": {
                                "untouched": {
                                    "type": "string",
                                    "index": "not_analyzed"
                                }
                            }
                        }
                    }
                }
            }
        }
    }
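The three properties in the mapping above share the same shape, so the properties block can be generated rather than typed by hand. The following is only a minimal Python sketch using the standard library: the helper name multi_field_mapping is hypothetical, it builds a plain dict and does not talk to Elasticsearch, and it assumes the same string / not_analyzed mapping syntax used in this answer.

```python
import json


def multi_field_mapping(field_names, sub_field="untouched"):
    """Build a 'properties' block where every field gets an untokenized sub-field.

    Hypothetical helper for illustration; it mirrors the mapping shown above.
    """
    properties = {}
    for name in field_names:
        properties[name] = {
            "type": "string",  # analyzed main field, as in the answer
            "fields": {
                # untokenized copy used for aggregations
                sub_field: {"type": "string", "index": "not_analyzed"}
            },
        }
    return {"properties": properties}


mapping = multi_field_mapping(["author", "title", "docType"])
print(json.dumps(mapping, indent=4))
```

The printed JSON corresponds to the "book" type body in the mapping above; how it is sent to the cluster is left out on purpose.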

In your aggregation query, reference the untokenized fields:

"aggs" : {     "author" : {          "terms" : {              "field" : "author.untouched",              "size": 20,             "order" : { "_term" : "asc" }         }      },     "title" : {         "terms" : {            "field" : "title.untouched",            "size": 20         }     },     "contentType" : {         "terms" : {             "field" : "docType.untouched",             "size": 20         }     } } 
Dan Tuffery answered Sep 20 '22 12:09
