Assuming I have a date field on a document, I know using the date_histogram aggregation I can get a document count by day, month, year, etc.
What I want to do is get the average document count for January, February, March, etc. over several given years. The same goes for Monday, Tuesday, Wednesday, etc. over several given weeks. Is there a way to do this having just that same date field or what is the best way to accomplish this with Elasticsearch?
Example
Let's say we have a bunch of orders placed over three years:
What I want is the average of each month over the given years, so the output would be:
Jan (10 + 13 + 10 / 3 = 11 orders), Feb (6.33 orders), Mar (8.33 orders), Apr (13 orders), etc.
It would be best if this can be generalized for N years (or N Januaries, etc.) so that we search over any date range.
You can use 'monthOfYear' like this:
"aggregations": {
"timeslice": {
"histogram": {
"script": "doc['timestamp'].date.getMonthOfYear()",
"interval": 1,
"min_doc_count": 0,
"extended_bounds": {
"min": 1,
"max": 12
},
"order": {
"_key": "desc"
}
}
}
The extended bounds will ensure you get a value for every month (even if it is zero).
If you want the month names, you can either do that in your own code, or, do this (at the consequence that you won't get values for months that have no data):
"aggregations": {
"monthOfYear": {
"terms": {
"script": "doc['timestamp'].date.monthOfYear().getAsText()",
"order": {
"_term": "asc"
}
}
}
Once you've got this, you can nest your stats aggregation inside this one:
"aggregations: {
"monthOfYear": {
"terms": {
...
},
"aggregations": {
"stats": ...
}
}
}
The question is pretty old now, but, hope this helps someone.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With