How to get an Elasticsearch aggregation with multiple fields

I'm attempting to find related tags to the one currently being viewed. Every document in our index is tagged. Each tag is formed of two parts - an ID and text name:

    {
        ...
        meta: {
            ...
            tags: [
                {
                    id: 123,
                    name: 'Biscuits'
                },
                {
                    id: 456,
                    name: 'Cakes'
                },
                {
                    id: 789,
                    name: 'Breads'
                }
            ]
        }
    }

To fetch the related tags I am simply querying the documents and getting an aggregate of their tags:

    {
        "query": {
            "bool": {
                "must": [
                    {
                        "match": {
                            "item.meta.tags.id": "123"
                        }
                    },
                    {
                        ...
                    }
                ]
            }
        },
        "aggs": {
            "baked_goods": {
                "terms": {
                    "field": "item.meta.tags.id",
                    "min_doc_count": 2
                }
            }
        }
    }

This works perfectly and I am getting the results I want. However, I need both the tag ID and the name to do anything useful. I have explored how to accomplish this, and the solutions seem to be:

1. Combine the fields when indexing
2. A script to munge together the fields
3. A nested aggregation

Options one and two are not available to me, so I have been going with option three, but it is not responding in the expected manner. Given the following query (still searching for documents also tagged with 'Biscuits'):

    {
        ...
        "aggs": {
            "baked_goods": {
                "terms": {
                    "field": "item.meta.tags.id",
                    "min_doc_count": 2
                },
                "aggs": {
                    "name": {
                        "terms": {
                            "field": "item.meta.tags.name"
                        }
                    }
                }
            }
        }
    }

I will get this result:

    {
        ...
        "aggregations": {
            "baked_goods": {
                "buckets": [
                    {
                        "key": "456",
                        "doc_count": 11,
                        "name": {
                            "buckets": [
                                {
                                    "key": "Biscuits",
                                    "doc_count": 11
                                },
                                {
                                    "key": "Cakes",
                                    "doc_count": 11
                                }
                            ]
                        }
                    }
                ]
            }
        }
    }

The nested aggregation includes both the search term and the tag I'm after (returned in alphabetical order).

I have tried to mitigate this by adding an exclude to the nested aggregation, but this slowed the query down far too much (around 100 times slower for 500,000 docs). So far the fastest solution is to de-dupe the result manually.
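For reference, the manual de-dupe could look something like the Python sketch below. It is only one reading of "de-dupe manually": the function name `related_tags` is made up for illustration, and it assumes that a tag's own name appears on every document in its ID bucket, so its count equals the bucket's `doc_count`. If another tag happens to co-occur on every one of those documents, the result is ambiguous and this simply takes the first candidate.

```python
def related_tags(aggregations, searched_name):
    """Pair each tag ID bucket with one name, ignoring the searched-for tag."""
    pairs = []
    for bucket in aggregations["baked_goods"]["buckets"]:
        candidates = [
            b for b in bucket["name"]["buckets"]
            # Drop the tag being viewed; it is on every matching document.
            if b["key"] != searched_name
            # Assumption: the bucket's own name occurs on all of its documents.
            and b["doc_count"] == bucket["doc_count"]
        ]
        if candidates:
            pairs.append((bucket["key"], candidates[0]["key"], bucket["doc_count"]))
    return pairs


# Response shape copied from the result shown in the question.
response = {
    "aggregations": {
        "baked_goods": {
            "buckets": [
                {
                    "key": "456",
                    "doc_count": 11,
                    "name": {
                        "buckets": [
                            {"key": "Biscuits", "doc_count": 11},
                            {"key": "Cakes", "doc_count": 11},
                        ]
                    },
                }
            ]
        }
    }
}

print(related_tags(response["aggregations"], "Biscuits"))
```

This keeps the query itself fast, at the cost of doing the clean-up in application code.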

What is the best way to get an aggregation of tags with both the tag ID and tag name in the response?

Thanks for making it this far!

303

asked Jun 09 '15 09:06

i_like_robots

1 Answer

By the looks of it, your tags field is not nested. For this aggregation to work, it needs to be nested so that there is an association between an id and a name. Without nested, the list of ids is just one array and the list of names is another array. A mapping like this makes the field nested:

    "item": {
      "properties": {
        "meta": {
          "properties": {
            "tags": {
              "type": "nested",           <-- nested field
              "include_in_parent": true,  <-- to, also, keep the flat array-like structure
              "properties": {
                "id": {
                  "type": "integer"
                },
                "name": {
                  "type": "string"
                }
              }
            }
          }
        }
      }
    }

Also, note that I've added "include_in_parent": true to the mapping, which means that your nested tags will also be indexed as a "flat", array-like structure on the parent document.

So, everything you had so far in your queries will still work without any changes to the queries.
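To make the difference concrete, here is a small pure-Python simulation (not Elasticsearch code) of what the two mappings do to a terms-by-id, then terms-by-name aggregation. The sample documents and the function names `flat_agg` and `nested_agg` are invented for illustration, and text analysis such as lowercasing is ignored.

```python
from collections import Counter, defaultdict

# Hypothetical sample documents, shaped like the question's tags array.
docs = [
    {"tags": [{"id": 123, "name": "Biscuits"}, {"id": 456, "name": "Cakes"}]},
    {"tags": [{"id": 123, "name": "Biscuits"}, {"id": 456, "name": "Cakes"}]},
    {"tags": [{"id": 123, "name": "Biscuits"}, {"id": 789, "name": "Breads"}]},
]


def flat_agg(docs):
    """Mimic terms(id) > terms(name) on a non-nested (flattened) field."""
    buckets = defaultdict(Counter)
    for doc in docs:
        # Flattening leaves two independent multi-valued fields per document.
        ids = {tag["id"] for tag in doc["tags"]}
        names = {tag["name"] for tag in doc["tags"]}
        for tag_id in ids:
            # Every name on the document counts toward every id bucket.
            buckets[tag_id].update(names)
    return {tag_id: dict(counts) for tag_id, counts in buckets.items()}


def nested_agg(docs):
    """Mimic the same aggregation when each tag is its own nested object."""
    buckets = defaultdict(Counter)
    for doc in docs:
        for tag in doc["tags"]:
            # id and name come from the same tag object, so they stay paired.
            buckets[tag["id"]][tag["name"]] += 1
    return {tag_id: dict(counts) for tag_id, counts in buckets.items()}


print(flat_agg(docs)[456])
print(nested_agg(docs)[456])
```

With these three sample documents, `flat_agg` puts both Biscuits and Cakes under ID 456, which is the behaviour described in the question, while `nested_agg` leaves only Cakes there.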

For this particular query, though, the aggregation needs to change to something like this:

    {
      "aggs": {
        "baked_goods": {
          "nested": {
            "path": "item.meta.tags"
          },
          "aggs": {
            "name": {
              "terms": {
                "field": "item.meta.tags.id"
              },
              "aggs": {
                "name": {
                  "terms": {
                    "field": "item.meta.tags.name"
                  }
                }
              }
            }
          }
        }
      }
    }

And the result is like this:

    "aggregations": {
       "baked_goods": {
          "doc_count": 9,
          "name": {
             "doc_count_error_upper_bound": 0,
             "sum_other_doc_count": 0,
             "buckets": [
                {
                   "key": 123,
                   "doc_count": 3,
                   "name": {
                      "doc_count_error_upper_bound": 0,
                      "sum_other_doc_count": 0,
                      "buckets": [
                         {
                            "key": "biscuits",
                            "doc_count": 3
                         }
                      ]
                   }
                },
                {
                   "key": 456,
                   "doc_count": 2,
                   "name": {
                      "doc_count_error_upper_bound": 0,
                      "sum_other_doc_count": 0,
                      "buckets": [
                         {
                            "key": "cakes",
                            "doc_count": 2
                         }
                      ]
                   }
                },
                .....

166

answered Sep 29 '22 04:09

Andrei Stefan

Tags: aggregate, elasticsearch, faceted-search
