Elasticsearch average over date histogram buckets

I've got a bunch of documents indexed in Elasticsearch, and I need to get the following data:

For each month, get the average number of documents per working day of the month (or if impossible, use 20 days as the default).

I have already aggregated my data into monthly buckets using the date histogram aggregation. I tried nesting a stats aggregation, but that aggregation uses data extracted from a document field, not from the parent bucket.

Here is my query so far:

{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "docs_per_month": {
            "date_histogram": {
                "field": "created_date",
                "interval": "month",
                "min_doc_count": 0
            },
            "aggs": {
                "???": "???"
            }
        }
    }
}

edit

To make my question clearer, what I need is:

  • Get the total number of documents created in the month (which is already done thanks to the date_histogram aggregation)
  • Get the number of working days for the month
  • Divide the first by the second.
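The three steps above can also be sketched on the client side, after the date_histogram response comes back. This is only a minimal illustration, not a server-side solution: it assumes that "working day" means Monday to Friday (public holidays are ignored), that bucket keys are the epoch-millisecond `key` values of the histogram response, and that buckets start at midnight UTC. The function names `working_days_in_month` and `docs_per_working_day` are made up for this example.

```python
import calendar
from datetime import date, datetime, timezone


def working_days_in_month(year: int, month: int) -> int:
    """Count Monday-to-Friday days in a month (public holidays are ignored)."""
    _, days_in_month = calendar.monthrange(year, month)
    return sum(
        1
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() < 5  # 0..4 means Mon..Fri
    )


def docs_per_working_day(buckets):
    """Divide each monthly doc_count by that month's working days.

    `buckets` is assumed to be the "buckets" list of a monthly
    date_histogram response, e.g.
    [{"key": 1433116800000, "doc_count": 440}, ...]
    """
    result = {}
    for bucket in buckets:
        # "key" is the bucket start in epoch milliseconds (UTC assumed here).
        start = datetime.fromtimestamp(bucket["key"] / 1000, tz=timezone.utc)
        days = working_days_in_month(start.year, start.month)
        result[f"{start.year:04d}-{start.month:02d}"] = bucket["doc_count"] / days
    return result


if __name__ == "__main__":
    sample = [{"key": 1433116800000, "doc_count": 440}]  # June 2015
    print(docs_per_working_day(sample))
```

If a holiday calendar is needed, the weekday test is the only place that would have to change.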

Thibault J asked Jun 11 '15 08:06



1 Answer

For anyone still interested, you can now do this with the avg_bucket aggregation. It's still a bit tricky, because you cannot simply run avg_bucket on a date_histogram aggregation result, but if you nest a secondary value_count aggregation on some field that is present in every document, it works fine :)

{
  "size": 0,
  "aggs": {
    "orders_per_day": {
      "date_histogram": {
        "field": "orderedDate",
        "interval": "day"
      },
      "aggs": {
        "amount": {
          "value_count": {
            "field": "dateCreated"
          }
        }
      }
    },
    "avg_daily_order": {
      "avg_bucket": {
        "buckets_path": "orders_per_day>amount"
      }
    }
  }
}
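To make the role of `buckets_path` concrete, here is a rough client-side equivalent of what the query above asks for: take the `amount` value of every `orders_per_day` bucket and average them. It is only a sketch under assumptions: the sample response is invented, the helper name `average_bucket_value` is hypothetical, and the handling of empty buckets (controlled on the server by the `gap_policy` setting) is not modelled, so every sample bucket contains documents.

```python
def average_bucket_value(response: dict, histogram: str, metric: str) -> float:
    """Average one single-value metric across all buckets of a histogram."""
    buckets = response["aggregations"][histogram]["buckets"]
    values = [bucket[metric]["value"] for bucket in buckets]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Invented response fragment shaped like the aggregation result above.
    sample_response = {
        "aggregations": {
            "orders_per_day": {
                "buckets": [
                    {"key": 1433116800000, "doc_count": 2, "amount": {"value": 2}},
                    {"key": 1433203200000, "doc_count": 4, "amount": {"value": 4}},
                    {"key": 1433289600000, "doc_count": 6, "amount": {"value": 6}},
                ]
            }
        }
    }
    print(average_bucket_value(sample_response, "orders_per_day", "amount"))
```

In the real response the server-side result is returned under the `avg_daily_order` key, so this loop is needed only to understand or cross-check the number.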
dularion answered Sep 21 '22 19:09
