
Adding additional fields to ElasticSearch terms aggregation

Tags:

elasticsearch

Indexed documents are like:

    {
      id: 1,
      title: 'Blah',
      ...
      platform: {id: 84, url: 'http://facebook.com', name: 'Facebook'}
      ...
    }

What I want is to count documents and output stats by platform. For counting, I can use a terms aggregation on the platform.id field:

    aggs: {
      platforms: {
        terms: {field: 'platform.id'}
      }
    }

This way I receive the stats as multiple buckets that look like {key: 8, doc_count: 162511}, as expected.

Now, can I somehow also add platform.name and platform.url to those buckets (for pretty output of the stats)? The best I've come up with looks like this:

    aggs: {
      platforms: {
        terms: {field: 'platform.id'},
        aggs: {
          name: {terms: {field: 'platform.name'}},
          url: {terms: {field: 'platform.url'}}
        }
      }
    }

This does work, but it returns a pretty complicated structure in each bucket:

    {key: 7,
     doc_count: 528568,
     url: {
       doc_count_error_upper_bound: 0,
       sum_other_doc_count: 0,
       buckets: [{key: "http://facebook.com", doc_count: 528568}]
     },
     name: {
       doc_count_error_upper_bound: 0,
       sum_other_doc_count: 0,
       buckets: [{key: "Facebook", doc_count: 528568}]
     }},

Of course, the platform's name and url can be extracted from this structure (like bucket.url.buckets.first.key), but is there a cleaner and simpler way to do the task?
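For illustration, the extraction described above can be sketched in Python, assuming the bucket has already been parsed into a plain dict. The helper name platform_from_nested_terms and the output keys are hypothetical, and the sample data is copied from the response shown above.

```python
def platform_from_nested_terms(bucket):
    """Flatten one 'platforms' bucket that carries nested terms sub-aggregations.

    Hypothetical helper: it assumes the sub-aggregations are called
    'name' and 'url', as in the query above.
    """
    def first_key(sub_agg):
        # Each sub-aggregation holds its own bucket list; since every document
        # of a platform shares the same name/url, the first bucket is enough.
        buckets = bucket.get(sub_agg, {}).get("buckets", [])
        return buckets[0]["key"] if buckets else None

    return {
        "id": bucket["key"],
        "count": bucket["doc_count"],
        "name": first_key("name"),
        "url": first_key("url"),
    }


# Sample bucket copied from the response shown above.
sample_bucket = {
    "key": 7,
    "doc_count": 528568,
    "url": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [{"key": "http://facebook.com", "doc_count": 528568}],
    },
    "name": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [{"key": "Facebook", "doc_count": 528568}],
    },
}

nested_result = platform_from_nested_terms(sample_bucket)
```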


asked Oct 23 '15 12:10

zverok

1 Answer

It seems the best way to express the intent is a top hits aggregation: "from each aggregated group, select only one document", and then extract the platform from it:

    aggs: {
      platforms: {
        terms: {field: 'platform.id'},
        aggs: {
          platform: {top_hits: {size: 1, _source: {include: ['platform']}}}
        }
      }
    }

This way, each bucket will look like:

    {"key": 7,
     "doc_count": 529939,
     "platform": {
       "hits": {
         "hits": [{
           "_source": {
             "platform": {"id": 7, "name": "Facebook", "url": "http://facebook.com"}
           }
         }]
       }
     }
    }

Which is kinda too deep (as usual with ES), but clean: bucket.platform.hits.hits.first._source.platform
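As a rough sketch of how that path can be walked for every bucket, here is a small Python helper. It assumes the search response has been parsed into a dict with the aggregation results under the "aggregations" key; the function name platform_stats and its parameters are hypothetical, and the sample response is built from the bucket shown above.

```python
def platform_stats(response, agg_name="platforms", hit_name="platform"):
    """Turn top_hits buckets into flat rows: platform fields plus doc_count.

    Hypothetical helper: agg_name and hit_name must match the names
    used in the aggregation request above.
    """
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        # top_hits with size 1 returns at most one hit per bucket.
        hits = bucket[hit_name]["hits"]["hits"]
        platform = hits[0]["_source"]["platform"] if hits else {}
        rows.append({**platform, "doc_count": bucket["doc_count"]})
    return rows


# Sample response wrapping the single bucket shown above.
sample_response = {
    "aggregations": {
        "platforms": {
            "buckets": [
                {
                    "key": 7,
                    "doc_count": 529939,
                    "platform": {
                        "hits": {
                            "hits": [
                                {
                                    "_source": {
                                        "platform": {
                                            "id": 7,
                                            "name": "Facebook",
                                            "url": "http://facebook.com",
                                        }
                                    }
                                }
                            ]
                        }
                    },
                }
            ]
        }
    }
}

stats_rows = platform_stats(sample_response)
```

Depending on the Elasticsearch version, the source filter key in the request may need to be spelled includes rather than include.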

answered Sep 19 '22 23:09

zverok
