I'm currently attempting to generate graphs from data collected in Elasticsearch. Every time a user is created, I insert a record into ES with the following (example) data:
{
  "country": "US",
  "id": "79ca9523dcd62420030de12b75e08bb7",
  "createdAt": "1450912898"
}
ID is a hash of the User ID, so that the User ID cannot be determined from the ID stored in ES, for privacy reasons.
The type mapping in the ES index is as follows:
{
  "user": {
    "_timestamp": {
      "enabled": true
    },
    "properties": {
      "country": {
        "type": "string"
      },
      "createdAt": {
        "type": "date",
        "format": "epoch_second"
      },
      "id": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}
Now, to get a graph of users per day, I use the following query:
{
  "size": 0,
  "query": {
    "type": {
      "value": "user"
    }
  },
  "aggs": {
    "users_per_day": {
      "date_histogram": {
        "field": "createdAt",
        "interval": "day"
      }
    }
  }
}
This gives me a nice result like the following (for this output I set the interval to minute, to give a better idea of the problem):
[{
  "key_as_string": "1450909920",
  "key": 1450909920000,
  "doc_count": 8
},
{
  "key_as_string": "1450909980",
  "key": 1450909980000,
  "doc_count": 2
},
{
  "key_as_string": "1450910040",
  "key": 1450910040000,
  "doc_count": 5
},
{
  "key_as_string": "1450910100",
  "key": 1450910100000,
  "doc_count": 8
},
{
  "key_as_string": "1450910160",
  "key": 1450910160000,
  "doc_count": 4
},
{
  "key_as_string": "1450910220",
  "key": 1450910220000,
  "doc_count": 3
},
{
  "key_as_string": "1450910280",
  "key": 1450910280000,
  "doc_count": 6
}]
I'd like to use the doc_count values to generate a cumulative graph, so that I can see the growth of my user base instead of the number of accounts created per day. Despite searching the internet, I cannot find a single answer that seems related to my question. Most answers I find direct me to the Cumulative Sum Aggregation page, but the example given there produces a cumulative sum of a metric captured within each bucket. I'd like a cumulative sum of the bucket totals themselves.
You're on the right path with the cumulative sum aggregation, and you can definitely use it. You just need to use the special _count buckets path, and it will do exactly what you expect.
{
  "size": 0,
  "query": {
    "type": {
      "value": "user"
    }
  },
  "aggs": {
    "users_per_day": {
      "date_histogram": {
        "field": "createdAt",
        "interval": "day"
      },
      "aggs": {
        "cumulative": {
          "cumulative_sum": {
            "buckets_path": "_count"
          }
        }
      }
    }
  }
}
The results will look like this:
[{
  "key_as_string": "1450909920",
  "key": 1450909920000,
  "doc_count": 8,
  "cumulative": {"value": 8}
},
{
  "key_as_string": "1450909980",
  "key": 1450909980000,
  "doc_count": 2,
  "cumulative": {"value": 10}
},
{
  "key_as_string": "1450910040",
  "key": 1450910040000,
  "doc_count": 5,
  "cumulative": {"value": 15}
},
{
  "key_as_string": "1450910100",
  "key": 1450910100000,
  "doc_count": 8,
  "cumulative": {"value": 23}
},
{
  "key_as_string": "1450910160",
  "key": 1450910160000,
  "doc_count": 4,
  "cumulative": {"value": 27}
},
{
  "key_as_string": "1450910220",
  "key": 1450910220000,
  "doc_count": 3,
  "cumulative": {"value": 30}
},
{
  "key_as_string": "1450910280",
  "key": 1450910280000,
  "doc_count": 6,
  "cumulative": {"value": 36}
}]
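As a sanity check, the running totals above are exactly what you'd get by accumulating the per-bucket doc_count values client-side. A minimal Python sketch (not part of the original query, just an illustration of what cumulative_sum with the _count buckets path computes):

```python
from itertools import accumulate

# Per-bucket doc_count values from the date_histogram response above
doc_counts = [8, 2, 5, 8, 4, 3, 6]

# cumulative_sum over buckets_path "_count" is a running total of doc_count
cumulative = list(accumulate(doc_counts))
print(cumulative)  # [8, 10, 15, 23, 27, 30, 36]
```

The printed values match the "cumulative" values in the response, confirming that Elasticsearch sums the bucket counts across buckets rather than within one bucket.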