Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

elasticsearch query aggs sorted max date

I have data such as this:

Id    GroupId     UpdateDate
1    1                2013-11-15T12:00:00
2    1                2013-11-20T12:00:00
3    2                2013-12-01T12:00:00
4    2                2013-13-01T12:00:00
5    2                2013-11-01T12:00:00
6    3                2013-10-01T12:00:00

How can i write a query to return the list filtered/grouped to the max UpdateDate foreach group? and the final list is sorted desc by UpdateDate.

I expect this output:

Id    GroupId     UpdateDate
4    2                2013-13-01T12:00:00
2    1                2013-11-20T12:00:00
6    3                2013-10-01T12:00:00

Thank You :)

like image 548
user3739131 Avatar asked Sep 30 '22 13:09

user3739131


1 Answers

Yes this is possible with elasticsearch but the data will be in JSON format that needs to be flatten in the format you show above. Here's how I did it using Marvel Sense

Bulk load data:

POST myindex/mytype/_bulk
{"index":{}}
{"id":1,"GroupId":1,"UpdateDate":"2013-11-15T12:00:00"}
{"index":{}}
{"id":2,"GroupId":1,"UpdateDate":"2013-11-20T12:00:00"}
{"index":{}}
{"id":3,"GroupId":2,"UpdateDate":"2013-12-01T12:00:00"}
{"index":{}}
{"id":4,"GroupId":2,"UpdateDate":"2013-12-01T12:00:00"}
{"index":{}}
{"id":5,"GroupId":2,"UpdateDate":"2013-11-01T12:00:00"}
{"index":{}}
{"id":6,"GroupId":3,"UpdateDate":"2013-10-01T12:00:00"}

GET max by group:

GET myindex/mytype/_search?search_type=count
{
  "aggs": {
    "NAME": {
      "terms": {
        "field": "GroupId"
      },
      "aggs": {
        "NAME": {
          "max": {
            "field": "UpdateDate"
          }
        }
     }
    }
  }
}

Output:

{
...
   "aggregations": {
      "NAME": {
         "buckets": [
            {
               "key": 2,
               "doc_count": 3,
               "NAME": {
                 "value": 1385899200000
              }
           },
            {
               "key": 1,
               "doc_count": 2,
               "NAME": {
                  "value": 1384948800000
               }
            },
            {
               "key": 3,
               "doc_count": 1,
               "NAME": {
                  "value": 1380628800000
               }
            }
         ]
      }
   }
...
}

The max date gets returned as Linux time which needs to be converted back to readable dateformat.

like image 84
codeBarer Avatar answered Oct 27 '22 11:10

codeBarer