Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Elasticsearch Date Histogram aggregation with specific time range

We are performing 3 level aggregation for a certain date range we require fetching the distinct "Website" name grouped by distinct "HitCount" vale grouped by "DateTime" intervals. Here, histogram aggregation allows us to fetch the interval based documents, however the "key_as_string" of date is always considered from 12 AM instead of the date range time provided in the query. Depending on the interval period value, the day (24 hrs starting from 12 AM of the from time) is divided and aggregation output is given.

For e.g. we have given the from time as "2015-11-10T11:00:00" and To time as "2015-11-13T11:00:00" with interval of 8 hrs

Following is the query used:

{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "DateTime": {
                  "from": "2015-11-10T11:00:00",
                  "to": "2015-11-13T11:00:00"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "Website": {
      "terms": {
        "field": "Website",
        "size": 0,
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "HitCount": {
          "terms": {
            "field": "HitCount",
            "size": 0,
            "order": {
              "_count": "desc"
            }
          },
          "aggs": {
            "DateTime": {
              "date_histogram": {
                "field": "DateTime",
                "interval": "8h",
                "min_doc_count": 0,
                "extended_bounds": {
                  "min": 1447153200000,
                  "max": 1447412400000
                }
              }
            }
          }
        }
      }
    }
  }
}

The query Output wrt 3rd level DateTime aggregation is:

"DateTime": {
"buckets": [
{
"key_as_string": "2015-11-10T08:00:00.000Z",
"key": 1447142400000,
"doc_count": 62698
}
,
{
"key_as_string": "2015-11-10T16:00:00.000Z",
"key": 1447171200000,
"doc_count": 248118
}
,
{
"key_as_string": "2015-11-11T00:00:00.000Z",
"key": 1447200000000,
"doc_count": 224898
}
,
{
"key_as_string": "2015-11-11T08:00:00.000Z",
"key": 1447228800000,
"doc_count": 221663
}
,
{
"key_as_string": "2015-11-11T16:00:00.000Z",
"key": 1447257600000,
"doc_count": 220935
}
,
{
"key_as_string": "2015-11-12T00:00:00.000Z",
"key": 1447286400000,
"doc_count": 219340
}
,
{
"key_as_string": "2015-11-12T08:00:00.000Z",
"key": 1447315200000,
"doc_count": 218452
}
,
{
"key_as_string": "2015-11-12T16:00:00.000Z",
"key": 1447344000000,
"doc_count": 190
}
,
{
"key_as_string": "2015-11-13T00:00:00.000Z",
"key": 1447372800000,
"doc_count": 0
}
,
{
"key_as_string": "2015-11-13T08:00:00.000Z",
"key": 1447401600000,
"doc_count": 0
}
]
}


Expected Output:

Here, we would expect the intervals to be divided and queried as:
2015-11-10T11:00:00 to 2015-11-10T19:00:00
2015-11-10T19:00:00 to 2015-11-11T03:00:00
2015-11-11T03:00:00 to 2015-11-11T11:00:00
2015-11-11T11:00:00 to 2015-11-11T19:00:00
2015-11-11T19:00:00 to 2015-11-12T03:00:00
2015-11-12T03:00:00 to 2015-11-12T11:00:00
2015-11-12T11:00:00 to 2015-11-12T19:00:00
2015-11-12T19:00:00 to 2015-11-13T03:00:00
2015-11-13T03:00:00 to 2015-11-13T11:00:00


ie. the "key_as_string" output value should be 2015-11-10T11:00:00, 2015-11-10T19:00:00, .... and so on

The above is required as we have given a From & to time of 11 AM so that it can be a updated value of every 8 hrs whenever we fire the query rather than getting a fixed range of time for the whole day.

Note: ES 1.7 is used
like image 362
Prakash Ghanshani Avatar asked Nov 16 '15 14:11

Prakash Ghanshani


1 Answers

The documentation explains that you can use the offset parameter.

So

{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "DateTime": {
                  "from": "2015-11-10T11:00:00",
                  "to": "2015-11-13T11:00:00"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "Website": {
      "terms": {
        "field": "Website",
        "size": 0,
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "HitCount": {
          "terms": {
            "field": "HitCount",
            "size": 0,
            "order": {
              "_count": "desc"
            }
          },
          "aggs": {
            "DateTime": {
              "date_histogram": {
                "field": "DateTime",
                "interval": "8h",
                "min_doc_count": 0,
                "offset": "+11h"
              }
            }
          }
        }
      }
    }
  }
}
like image 117
Vanlightly Avatar answered Oct 04 '22 10:10

Vanlightly