Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Aggregating with multiple fields returned in ElasticSearch

Suppose I have a relative simple index with the following fields...

"testdata": {
    "properties": {
       "code": {
          "type": "integer"
       },
       "name": {
          "type": "string"
       },
       "year": {
          "type": "integer"
       },
       "value": {
          "type": "integer"
       }
    }
}

I can write a query to get the total sum of the values aggregated by the code like so:

{
  "from":0,
  "size":0,
  "aggs": {
    "by_code": {
      "terms": {
        "field": "code"
      },
      "aggs": {
        "total_value": {
          "sum": {
            "field": "value"
          }
        }
      }
    }
  }
}

And this returns the following (abridged) results:

"aggregations": {
  "by_code": {
     "doc_count_error_upper_bound": 478,
     "sum_other_doc_count": 328116,
     "buckets": [
        {
           "key": 236948,
           "doc_count": 739,
           "total_value": {
              "value": 12537
           }
        },

However, this data is being fed to a web front-end, where it is required both the code and the name is displayed. So, the question is, is it possible to amend the query somehow to also return the name field, as well as the code field, in the results?

So, for example, the results can look a bit like this:

"aggregations": {
  "by_code": {
     "doc_count_error_upper_bound": 478,
     "sum_other_doc_count": 328116,
     "buckets": [
        {
           "key": 236948,
           "code": 236948,
           "name": "Test Name",
           "doc_count": 739,
           "total_value": {
              "value": 12537
           }
        },

I've read up on sub-aggregations, but in this case there is a one-to-one relationship between code and name (so, you wouldn't have different names for the same key). Also, in my real case, there are 5 other fields, like description, that I would like to return, so I am wondering if there was another way to do it.

In SQL (from which this data originally came from before it was swapped to ElasticSearch) I would write the following query

SELECT Code, Name, SUM(Value) AS Total_Value
FROM [TestData] 
GROUP BY Code, Name
like image 590
Tim C Avatar asked Oct 11 '16 12:10

Tim C


Video Answer


1 Answers

You can achieve this using scripting, i.e. instead of specifying a field, you specify a combination of fields:

{
  "from":0,
  "size":0,
  "aggs": {
    "by_code": {
      "terms": {
        "script": "[doc.code.value, doc.name.value].join('-')"
      },
      "aggs": {
        "total_value": {
          "sum": {
            "field": "value"
          }
        }
      }
    }
  }
}

note: you need to make sure to enable dynamic scripting for this to work

like image 80
Val Avatar answered Nov 15 '22 10:11

Val