
How to delete older logs in ELK to give each application a certain disk quota

I am trying to use the ELK (Elasticsearch+Logstash+Kibana) stack in the following scenario:

I have about ten applications that send their logs, through Logstash, to a single Elasticsearch cluster.

Some of these applications naturally generate more logs than others, and occasionally one of them can go 'crazy' (because of a bug, for instance) and generate far more log entries than it normally does. As a result, the disk space available in the cluster can be unfairly 'taken' by the logs of a single application, leaving too little room for the others.

I am currently managing the available disk space through Elasticsearch Curator. It runs periodically from the crontab and deletes older indices based on a disk usage quota. When the disk space used by all indices exceeds a certain limit, the oldest indices are deleted, one by one, until their combined disk usage is within the specified limit again.
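The deletion rule described above can be sketched roughly as follows. This is only an illustration of the logic, not Curator's actual code: the `Index` type, the `indices_to_delete` function, and the sample index names and sizes are all hypothetical.

```python
from typing import NamedTuple


class Index(NamedTuple):
    name: str
    created: int      # creation time, e.g. epoch seconds (hypothetical field)
    size_bytes: int   # disk space used by the index


def indices_to_delete(indices, limit_bytes):
    """Return the names of the oldest indices to drop so the rest fit the limit."""
    total = sum(i.size_bytes for i in indices)
    doomed = []
    # Walk from oldest to newest, dropping indices until usage is within the limit.
    for idx in sorted(indices, key=lambda i: i.created):
        if total <= limit_bytes:
            break
        doomed.append(idx.name)
        total -= idx.size_bytes
    return doomed


# Made-up hourly indices, sizes in arbitrary units.
indices = [
    Index("logstash-2015.02.15.10", 1, 400),
    Index("logstash-2015.02.15.11", 2, 300),
    Index("logstash-2015.02.15.12", 3, 500),
]
print(indices_to_delete(indices, 900))
```

Because the limit applies to the sum over all indices, a single noisy application can push out the older indices of every other application.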

The first problem with this approach is that Elasticsearch Curator can only delete entire indices. Hence, I had to configure Logstash to create a separate index per hour, making the indices finer-grained so that Curator deletes smaller chunks of logs at a time. In addition, it is very difficult to decide how often Curator should run. If applications generate logs at a higher rate, even one-hour indices may not be small enough. Secondly, there is no way to specify a disk usage quota for each application.

Ideally, Elasticsearch should be able to delete older log entries by itself whenever the indices reach a certain disk usage limit. This would eliminate the problem of defining how often Curator should run. However, I could not find any similar feature in the Elasticsearch manual.

Would anybody recommend a different approach to address these issues?

References: http://www.elasticsearch.org https://github.com/elasticsearch/curator

Gabriel C asked Feb 15 '15 18:02





2 Answers

Try index lifecycle management (ILM), which is available in Elastic Stack 6.6 and newer.

Please check this link:
https://www.elastic.co/guide/en/elasticsearch/reference/6.6/getting-started-index-lifecycle-management.html

The policy below rolls over to a new index when the current one grows beyond 2 GB or becomes older than 1 day, and it deletes each index 1 day after it has been rolled over.

PUT _ilm/policy/stream_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "2GB",
            "max_age": "1d"
          }
        }
      },
      "delete": {
        "min_age": "1d",
        "actions": {
          "delete": {} 
        }
      }
    }
  }
}
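To get closer to per-application limits, one option is to generate one such policy per application, each with its own rollover size and retention. The sketch below only builds the request bodies. The application names, the limits, and the `<app>_policy` naming scheme are assumptions for illustration, and each application would still need its own index template and rollover alias pointing at its policy. Note that ILM deletes by age, not by total disk usage, so this bounds the size of each index but is not a strict quota.

```python
import json


def build_ilm_policy(max_size, max_age, delete_after):
    """Build an ILM policy body with the same shape as stream_policy above."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {"max_size": max_size, "max_age": max_age}
                    }
                },
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


# Hypothetical applications: (rollover size, rollover age, delete after).
APP_LIMITS = {
    "billing": ("2GB", "1d", "1d"),
    "frontend": ("500MB", "1d", "12h"),
}


def policies_for(app_limits):
    """Map a policy name (assumed scheme: <app>_policy) to its request body."""
    return {
        f"{app}_policy": build_ilm_policy(*limits)
        for app, limits in app_limits.items()
    }


if __name__ == "__main__":
    for name, body in policies_for(APP_LIMITS).items():
        print(f"PUT _ilm/policy/{name}")
        print(json.dumps(body, indent=2))
```

Each printed request can be pasted into the Kibana console. A noisy application then rolls over and expires its own indices without affecting the retention of the others.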
Angel H answered Sep 29 '22 12:09



This is how you can delete your old logs (Filebeat logs in this example). The wildcard removes every index whose name starts with filebeat-2016:

curl -XDELETE 'localhost:9200/filebeat-2016*?pretty'
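If the cutoff should move with time instead of being a fixed year, the index names to delete can be computed first. This is a minimal sketch that assumes daily indices named with a `-YYYY.MM.DD` suffix (for example `filebeat-2016.12.30`). The function name and the sample index names are hypothetical.

```python
import re
from datetime import date, timedelta

# Assumed naming scheme: anything ending in -YYYY.MM.DD
DATE_SUFFIX = re.compile(r"-(\d{4})\.(\d{2})\.(\d{2})$")


def expired_indices(names, today, keep_days):
    """Return index names whose date suffix is older than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        match = DATE_SUFFIX.search(name)
        if not match:
            continue  # not a dated index, leave it alone
        try:
            index_day = date(*map(int, match.groups()))
        except ValueError:
            continue  # suffix looks like a date but is not a valid one
        if index_day < cutoff:
            expired.append(name)
    return expired


names = ["filebeat-2016.12.30", "filebeat-2017.01.02", ".kibana"]
print(expired_indices(names, date(2017, 1, 5), keep_days=5))
```

Each returned name can then be passed to the same DELETE request shown above, one index at a time.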
Abhishek Goel answered Sep 29 '22 10:09
