
Insert aggregation results into an index

The goal is to build an Elasticsearch index that contains only the most recent document from each group of related documents, in order to track the current state of some monitoring counters and states.

I have crafted a simple Elasticsearch aggregation query:

{
  "size": 0,
  "aggs": {
    "group_by_monitor": {
      "terms": {
        "field": "monitor_name"
      },
      "aggs": {
        "get_latest": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "timestamp": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}

It groups related documents into buckets and selects the most recent document for each bucket.
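To make the intent of the query concrete, here is a minimal Python sketch of the same "latest document per group" logic applied to an in-memory list. It does not call Elasticsearch; the function name `latest_per_monitor` and the sample documents are hypothetical and only illustrate what the `terms` + `top_hits` combination is expected to return.

```python
def latest_per_monitor(docs):
    """Return {monitor_name: most recent document}.

    Mimics a terms aggregation on "monitor_name" with a
    top_hits sub-aggregation of size 1 sorted by timestamp desc.
    """
    latest = {}
    for doc in docs:
        name = doc["monitor_name"]
        # Keep the document with the greatest timestamp seen so far.
        if name not in latest or doc["timestamp"] > latest[name]["timestamp"]:
            latest[name] = doc
    return latest


# Hypothetical sample data; ISO 8601 UTC timestamps compare correctly as strings.
docs = [
    {"monitor_name": "cpu", "timestamp": "2016-04-08T10:00:00Z", "value": 41},
    {"monitor_name": "cpu", "timestamp": "2016-04-08T10:05:00Z", "value": 87},
    {"monitor_name": "disk", "timestamp": "2016-04-08T09:30:00Z", "value": 12},
]

result = latest_per_monitor(docs)
for name, doc in sorted(result.items()):
    print(name, doc["timestamp"], doc["value"])
```

Note that, unlike this sketch, a `terms` aggregation returns only the top 10 buckets by default, so its `size` may need to be raised if there are more monitors than that.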

Here are the different ideas I had to get the job done:

  1. directly use the aggregation query to push the results into the index, but this does not seem possible (see "Is it possible to put the results of an ElasticSearch aggregation back into the index?").
  2. use the Logstash Elasticsearch input plugin to execute the aggregation query and the Elasticsearch output plugin to push into the index, but it seems the input plugin only looks at the hits field and is unable to handle aggregation results (see "Aggregation Query possible input ES plugin").
  3. use the Logstash http_poller plugin to get a JSON document, but it does not seem to allow specifying a body for the HTTP request.
  4. use the Logstash exec plugin to execute cURL commands to get the JSON, but this seems quite cumbersome and would be my last resort.
  5. use the NEST API to build a basic application that will do polling, extract results, clean them and inject the resulting documents into the target index, but I'd like to avoid adding a new tool to maintain.
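For reference, the "extract results, clean them and inject" step of the last option can be sketched in a few lines. The Python below is only an illustration under stated assumptions (it is not NEST, and it performs no HTTP calls): it takes an already parsed search response for the aggregation above and builds a `_bulk` request body. The function name `bulk_body_from_aggregation`, the target index name and the choice of the bucket key as document `_id` are assumptions; the action lines use the typeless form, so older Elasticsearch versions would also need a `_type`.

```python
import json


def bulk_body_from_aggregation(response, target_index):
    """Build an NDJSON _bulk body from a terms + top_hits response.

    Each bucket contributes one "index" action whose _id is the bucket
    key (the monitor name), so re-running the poll overwrites the
    previous state of that monitor instead of adding a new document.
    """
    lines = []
    buckets = response["aggregations"]["group_by_monitor"]["buckets"]
    for bucket in buckets:
        hits = bucket["get_latest"]["hits"]["hits"]
        if not hits:
            continue  # defensive: skip a bucket without a top hit
        action = {"index": {"_index": target_index, "_id": str(bucket["key"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(hits[0]["_source"]))
    # The bulk API expects every line, including the last, to end with "\n".
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical, trimmed-down response in the shape Elasticsearch returns.
sample_response = {
    "aggregations": {
        "group_by_monitor": {
            "buckets": [
                {
                    "key": "cpu",
                    "doc_count": 2,
                    "get_latest": {"hits": {"hits": [
                        {"_id": "a1", "_source": {
                            "monitor_name": "cpu",
                            "timestamp": "2016-04-08T10:05:00Z",
                            "value": 87}}
                    ]}},
                }
            ]
        }
    }
}

body = bulk_body_from_aggregation(sample_response, "target_index_name")
print(body, end="")
```

The resulting string would be sent as the body of a POST to the `_bulk` endpoint with an NDJSON content type; scheduling the poll and error handling are left out of this sketch.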

Is there a reasonably simple way of accomplishing this?

asked Apr 08 '16 17:04 by Pragmateek


1 Answer

Edit the logstash.conf file as follows:

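input {
  elasticsearch {
    hosts => "localhost"
    index => "source_index_name"
    type => "index_type"
    # '{Query}' is a placeholder for the JSON query to run against the source index
    query => '{Query}'
    size => 500
    scroll => "5m"
    # keep each source document's metadata (such as _id) on the event
    docinfo => true
  }
}

output {
  elasticsearch {
    index => "target_index_name"
    # reuse the source document's _id so that re-runs overwrite instead of duplicating
    document_id => "%{[@metadata][_id]}"
  }
}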
answered Jan 04 '23 07:01 by Akshay Patil