
Elasticsearch aggregation on object

How can I run an aggregation on a single property of an object but get all of the object's properties in the result? For example, I want to get [{'doc_count': 1, 'key': {'id': 1, 'name': 'tag name'}}], but I get [{'doc_count': 1, 'key': '1'}] instead. Aggregating on the field 'tags' returns zero results.

Mapping:

{
  "test": {
    "properties" : {
      "tags" : {
        "type" : "object",
        "properties": {
          "id" : {"type": "string", "index": "not_analyzed"},
          "name" : {"type": "string", "index": "not_analyzed", "enabled": false}
        }
      }
    }
  }
}

Aggregation query: (returns only IDs as expected, but how can I get ID & name pairs in results?)

'aggregations': {
  'tags': {
    'terms': {
      'field': 'tags.id',
      'order': {'_count': 'desc'},
    },
  }
}

EDIT: I got the ID and name by aggregating on "script": "_source.tags", but I am still looking for a faster solution.

Dmytro Sadovnychyi asked May 01 '14 10:05




2 Answers

It's also possible to nest aggregations: you could aggregate by id, then by name.
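A minimal sketch of that nested approach, written in Python purely for illustration. The helper names, the sub-aggregation name "names", and the sample response are assumptions rather than anything from the question, and the sketch assumes tags.name is indexed so that it can be aggregated on. Because tags is an object rather than a nested type, the id/name pairing is only reliable when each document carries a single tag.

```python
def build_nested_tags_query(size=10):
    """Build a search body: terms on tags.id, then terms on tags.name inside each bucket."""
    return {
        "size": 0,  # only the aggregation results are wanted, not the hits
        "aggregations": {
            "tags": {
                "terms": {
                    "field": "tags.id",
                    "size": size,
                    "order": {"_count": "desc"},
                },
                # Sub-aggregation: computed once per tags.id bucket.
                "aggregations": {
                    "names": {"terms": {"field": "tags.name", "size": 1}}
                },
            }
        },
    }


def pair_ids_with_names(response):
    """Flatten the nested buckets into the id/name shape asked for in the question."""
    pairs = []
    for bucket in response["aggregations"]["tags"]["buckets"]:
        name_buckets = bucket["names"]["buckets"]
        name = name_buckets[0]["key"] if name_buckets else None
        pairs.append(
            {"doc_count": bucket["doc_count"], "key": {"id": bucket["key"], "name": name}}
        )
    return pairs


# Hypothetical response, shaped like what a terms aggregation with a sub-aggregation returns.
sample_response = {
    "aggregations": {
        "tags": {
            "buckets": [
                {
                    "key": "1",
                    "doc_count": 1,
                    "names": {"buckets": [{"key": "tag name", "doc_count": 1}]},
                }
            ]
        }
    }
}

print(pair_ids_with_names(sample_response))
```

The dictionary returned by build_nested_tags_query can be sent as the search body with whichever client is in use; the flattening step runs on the client side after the response comes back.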

Thomas Decaux answered Oct 13 '22 18:10


You can use a script if you want, e.g.

"terms":{"script":"doc['tags.id'].value + '|' + doc['tags.name'].value"}

For each bucket that is created, you will get a key containing the values of the fields you included in your script. To be honest, though, the purpose of aggregations is not to return full documents but to run calculations on groups of documents (buckets) and return the results, e.g. sums and distinct values. What your query actually does is create buckets based on the field tags.id.

Keep in mind that the key in the result will contain both values separated by a '|', so you may have to split it to extract all the information you need.
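As a rough sketch of that last step, the combined key can be split on the client side. The function name and the sample bucket below are hypothetical, and the sketch assumes that ids never contain the '|' character.

```python
def split_bucket_key(bucket, sep="|"):
    """Turn a bucket keyed by 'id|name' into a bucket keyed by an id/name object."""
    # partition() splits on the first separator only, so a name that
    # itself contains '|' is kept intact.
    tag_id, _, name = bucket["key"].partition(sep)
    return {"doc_count": bucket["doc_count"], "key": {"id": tag_id, "name": name}}


# Hypothetical bucket, shaped like the output of the script-based terms aggregation.
sample_bucket = {"key": "1|tag name", "doc_count": 1}

print(split_bucket_key(sample_bucket))
```

The id comes back as a string, which matches the string mapping of tags.id in the question; convert it if a number is needed.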

cpard answered Oct 13 '22 17:10