How to get an alarm when there are no logs for a time period in AWS Cloudwatch?

I have a Java application that runs in AWS Elastic Container Service. The application polls a queue periodically. Sometimes there is no response from the queue and the application hangs forever. I have wrapped the methods in try-catch blocks that log exceptions, yet nothing appears in CloudWatch after the hang: no exceptions, no errors. Is there a way to detect this situation (no logs arriving in CloudWatch), similar to filtering on an error log pattern, so that I can restart the service? Any trick or solution would be appreciated.

public void handleProcess() throws Exception {
    try {
        while (true) {
            Response response = QueueUitils.pollQueue(); // poll the queue
            QueueUitils.processMessage(response);
            TimeUnit.SECONDS.sleep(WAIT_TIME); // WAIT_TIME = 20
        }
    } catch (Exception e) {
        LOGGER.error("Data Queue operation failed: " + e.getMessage());
        throw e;
    }
}
Ashan Tharindu asked Sep 18 '20 14:09



2 Answers

You can do this with CloudWatch Alarms. I've set up a test Lambda function for this which runs every minute and logs to CloudWatch.

  1. Go to CloudWatch and click Alarms in the left-hand menu.
  2. Click the orange Create Alarm button.
  3. Click Select Metric.
  4. Choose Logs, then Log Group Metrics, and pick the IncomingLogEvents metric for the relevant log group (the log group to which your application is logging). In my case it's /aws/lambda/test-log-silence.
  5. Click Select Metric.
  6. Now you can specify how you want to measure the metric. I've chosen the average number of log entries over 5 minutes, so if there are no log entries for 5 minutes, that value would be zero.
  7. Scroll down and set the condition to "Lower Than or Equal To" zero. This triggers the alarm when there are no log entries for 5 minutes (or whatever period you choose). If the log group receives no events at all, CloudWatch may report no data points rather than a zero, so it is worth also setting the alarm's missing data treatment to breaching.
  8. Click Next and specify an SNS topic to push the notification to. You can set up an SNS topic to notify you via email, SMS, AWS Lambda, and others.
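
An alarm on log silence can only tell a hung application from a healthy one if the healthy application writes at least one log line per alarm period. Below is a minimal Java sketch of a polling loop that logs a heartbeat after every completed poll. The names `HeartbeatPoller` and `runIterations` are hypothetical, the `Supplier` is a stand-in for the question's `QueueUitils.pollQueue()`, and `java.util.logging` is used only to keep the sketch self-contained; it is an illustration under those assumptions, not the asker's actual code.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.logging.Logger;

// Hypothetical sketch: a polling loop that emits one heartbeat log line per poll.
class HeartbeatPoller {
    private static final Logger LOGGER = Logger.getLogger(HeartbeatPoller.class.getName());

    // poll stands in for the real queue call; the loop is bounded so the demo terminates.
    // Returns the number of heartbeat lines that were logged.
    static int runIterations(Supplier<String> poll, int iterations, long waitMillis) {
        int heartbeats = 0;
        for (int i = 0; i < iterations; i++) {
            String response = poll.get();
            // Heartbeat: if the poll above hangs, this line stops appearing
            // and the log group goes silent.
            LOGGER.info("Queue poll completed, response=" + response);
            heartbeats++;
            try {
                TimeUnit.MILLISECONDS.sleep(waitMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop polling.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return heartbeats;
    }

    public static void main(String[] args) {
        int logged = runIterations(() -> "demo-message", 3, 10);
        System.out.println("heartbeats logged: " + logged);
    }
}
```

With the question's 20-second wait, a healthy loop like this would log several times within a 5-minute alarm period, so a period without log events is a reasonable sign that the poll is stuck.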
brads3290 answered Sep 28 '22 00:09


With reference to brads3290's answer, if you are using AWS CDK:

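import * as cdk from '@aws-cdk/core'; // provides cdk.Duration used below
import * as cloudwatch from '@aws-cdk/aws-cloudwatch';
// ...
// Log events arriving in the log group, averaged over 5 minutes.
const metric = new cloudwatch.Metric({
  namespace: 'AWS/Logs',
  metricName: 'IncomingLogEvents',
  // Newer CDK releases use dimensionsMap in place of dimensions.
  dimensions: { LogGroupName: '/aws/lambda/test-log-silence' },
  statistic: 'Average',
  period: cdk.Duration.minutes(5),
});

// Alarm after one 5-minute period with no log events;
// a period without any data points counts as breaching.
const alarm = new cloudwatch.Alarm(this, 'Alarm', {
  metric,
  threshold: 0,
  comparisonOperator: cloudwatch.ComparisonOperator.LESS_THAN_OR_EQUAL_TO_THRESHOLD,
  evaluationPeriods: 1,
  datapointsToAlarm: 1,
  treatMissingData: cloudwatch.TreatMissingData.BREACHING,
});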

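Setting treatMissingData to BREACHING should also solve the problem of missing data being ignored: a period with no data points at all is then treated as breaching the threshold.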
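
To illustrate what the missing-data setting changes, here is a deliberately simplified Java model of a single evaluation period with a "less than or equal to threshold" comparison. It is not CloudWatch's actual evaluation logic, which looks at a range of data points, and the names `MissingDataSketch`, `Treat`, `State`, and `evaluate` are hypothetical.

```java
import java.util.OptionalDouble;

// Hypothetical, simplified model of one alarm evaluation period.
class MissingDataSketch {
    enum Treat { BREACHING, NOT_BREACHING, IGNORE, MISSING }
    enum State { OK, ALARM, INSUFFICIENT_DATA }

    // datapoint is empty when no metric value was published for the period.
    static State evaluate(OptionalDouble datapoint, double threshold, Treat treat, State current) {
        if (datapoint.isPresent()) {
            // Comparison used in the alarm above: value <= threshold breaches.
            return datapoint.getAsDouble() <= threshold ? State.ALARM : State.OK;
        }
        switch (treat) {
            case BREACHING:
                return State.ALARM;             // silence counts as a breach
            case NOT_BREACHING:
                return State.OK;                // silence counts as healthy
            case IGNORE:
                return current;                 // keep whatever state the alarm had
            default:
                return State.INSUFFICIENT_DATA; // MISSING: no verdict
        }
    }

    public static void main(String[] args) {
        OptionalDouble silent = OptionalDouble.empty();
        System.out.println(evaluate(silent, 0, Treat.MISSING, State.OK));
        System.out.println(evaluate(silent, 0, Treat.BREACHING, State.OK));
        System.out.println(evaluate(OptionalDouble.of(12), 0, Treat.BREACHING, State.ALARM));
    }
}
```

In this model a silent period yields INSUFFICIENT_DATA under the MISSING treatment and ALARM under BREACHING, which is why the CDK snippet sets BREACHING explicitly.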

sompnd answered Sep 28 '22 02:09