How can I sort the output of an aggregation by a field that exists in the source data but is not part of the aggregation's output?
My source data has a date field, and I would like the aggregation output to be sorted by it.
Is that possible? I've looked at using "order" within the aggregation, but I don't think it can see the date field to sort on it.
I've also tried adding a sub-aggregation that includes the date field, but again I cannot get it to sort on this field.
I'm calculating a hash for each document in my ETL on the way into Elasticsearch. My data set contains a lot of duplication, so I'm using an aggregation on the hash field to filter out duplicates, and that works fine. I need the aggregation output to retain date order so that I can work with it in Angular.
The documents look like this:
{
  "_id": 123,
  "_source": {
    "hash": "01010101010101",
    "user": "1",
    "dateTime": "2001/2/20 09:12:21",
    "action": "Login"
  }
}
{
  "_id": 124,
  "_source": {
    "hash": "01010101010101",
    "user": "1",
    "dateTime": "2001/2/20 09:12:21",
    "action": "Login"
  }
}
{
  "_id": 132,
  "_source": {
    "hash": "0202020202020",
    "user": "1",
    "dateTime": "2001/2/20 09:20:43",
    "action": "Logout"
  }
}
{
  "_id": 200,
  "_source": {
    "hash": "0303030303030303",
    "user": "2",
    "dateTime": "2001/2/22 09:32:14",
    "action": "Login"
  }
}
So I want to use an aggregation on the hash value to remove duplicates from my set and then render the response in date order.
My query:
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "action": "Login" } }
          ]
        }
      }
    }
  },
  "size": 0,
  "aggs": {
    "md5": {
      "terms": {
        "field": "hash",
        "size": 0
      },
      "aggs": {
        "byDate": {
          "terms": {
            "field": "dateTime",
            "size": 0
          }
        }
      }
    }
  }
}
Currently the output is ordered on the hash and I need it ordered on the date field within each hash bucket. Is that possible?
If the aggregation on "hash" is only there to remove duplicates, it might work for you to aggregate on "dateTime" first and nest the terms aggregation on "hash" inside it. You must specify the order on the "dateTime" aggregation explicitly, because terms buckets are otherwise ordered by document count. For example:
GET my_index/test/_search
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "action": "Login" } }
          ]
        }
      }
    }
  },
  "size": 0,
  "aggs": {
    "byDate": {
      "terms": {
        "field": "dateTime",
        "order": { "_term": "asc" }
      },
      "aggs": {
        "byHash": {
          "terms": {
            "field": "hash"
          }
        }
      }
    }
  }
}
This way, your results are sorted by "dateTime" first, with the distinct "hash" values grouped under each date.
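To work with the nested response on the client, the buckets can be flattened into one date-ordered list with one row per distinct hash. Here is a minimal Python sketch, assuming the response has the usual aggregations > byDate > buckets > byHash > buckets shape that the query above should produce. The names flatten_by_date and sample_response are hypothetical, and the sample response is hand-written from the example documents, not real Elasticsearch output.

```python
from typing import Any, Dict, List, Tuple


def flatten_by_date(response: Dict[str, Any]) -> List[Tuple[str, str, int]]:
    """Flatten byDate -> byHash buckets into (dateTime, hash, doc_count) rows.

    Assumes the outer byDate buckets already arrive in ascending order,
    as requested by the "order" clause in the query.
    """
    rows: List[Tuple[str, str, int]] = []
    for date_bucket in response["aggregations"]["byDate"]["buckets"]:
        # Prefer the formatted date if present; otherwise use the raw key.
        date_key = str(date_bucket.get("key_as_string") or date_bucket["key"])
        for hash_bucket in date_bucket["byHash"]["buckets"]:
            # One row per distinct hash, however many duplicates were indexed.
            rows.append((date_key, hash_bucket["key"], hash_bucket["doc_count"]))
    return rows


# Hand-written stand-in for a response covering the "Login" sample documents.
sample_response = {
    "aggregations": {
        "byDate": {
            "buckets": [
                {
                    "key": 982660341000,
                    "key_as_string": "2001/2/20 09:12:21",
                    "doc_count": 2,
                    "byHash": {
                        "buckets": [{"key": "01010101010101", "doc_count": 2}]
                    },
                },
                {
                    "key": 982834334000,
                    "key_as_string": "2001/2/22 09:32:14",
                    "doc_count": 1,
                    "byHash": {
                        "buckets": [{"key": "0303030303030303", "doc_count": 1}]
                    },
                },
            ]
        }
    }
}

if __name__ == "__main__":
    for row in flatten_by_date(sample_response):
        print(row)
```

The two duplicate Login documents collapse into a single row with a doc_count of 2, and the rows keep the order of the byDate buckets. The same loop translates directly to JavaScript if the flattening is done in Angular.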