I have an SNS topic that is published to whenever an SES email bounces. I have a CloudWatch alarm set up to trigger when a threshold of notifications is crossed over the past hour.
In practice, bounces are rare, and because SNS notifications are only sent when an email bounces, the alarm spends almost its entire time in the INSUFFICIENT_DATA state.
Ideally, I'd like the lack of SNS notifications to be treated as a zero value. In other monitoring systems (like Graphite/Grafana) this is called "null as zero."
Is there any way to treat the (lack of) notifications this way, and keep the alarm out of the insufficient data state?
Because the data points are not successfully being delivered to CloudWatch, the alarm can't retrieve any data points for those evaluation periods. This triggers an INSUFFICIENT_DATA state. After recovering connectivity, the application sends the backlog of data points, each one with its own timestamp.
To determine why you're not receiving SNS notifications, check the history of the CloudWatch alarm to find the status of the trigger action. SNS restricts the sources that can publish messages to the topic using access policies.
Amazon CloudWatch uses Amazon SNS to send email. First, create and subscribe to an SNS topic. When you create a CloudWatch alarm, you can add this SNS topic to send an email notification when the alarm changes state.
Amazon SNS does not send metric data to CloudWatch when the value is zero. This results in INSUFFICIENT_DATA for alarms where no emails are sent. However, your alarm should work as desired with no change.
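The INSUFFICIENT_DATA state results from two situations: the alarm has only just been created, or no data points were reported for the metric during the evaluation period.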
If there is at least one data point within the past hour, and the alarm has existed for at least an hour, then the state will be either OK or ALARM.
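The rule above can be sketched as a toy model. This is a simplified illustration, not CloudWatch's actual evaluation logic: the function name alarm_state, its parameters, and the greater-than comparison against the threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def alarm_state(datapoints, threshold, now, alarm_created, period=timedelta(hours=1)):
    """Toy model of the rule described above (hypothetical helper, not an AWS API).

    datapoints is a list of (timestamp, value) pairs for the metric.
    """
    window_start = now - period
    recent = [value for ts, value in datapoints if window_start < ts <= now]
    # No data in the window, or the alarm is younger than one period.
    if not recent or now - alarm_created < period:
        return "INSUFFICIENT_DATA"
    # Otherwise compare the count in the window against the threshold.
    return "ALARM" if sum(recent) > threshold else "OK"


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
created = now - timedelta(days=1)
one_bounce = [(now - timedelta(minutes=10), 1)]

print(alarm_state([], 5, now, created))          # INSUFFICIENT_DATA
print(alarm_state(one_bounce, 5, now, created))  # OK
```

With no bounces in the window the model reports INSUFFICIENT_DATA rather than OK, which is the behaviour the question describes.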
Therefore, you should treat INSUFFICIENT_DATA the same as OK. (It is even possible to trigger alarms based on entering the INSUFFICIENT_DATA state!)
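If you would rather have CloudWatch itself treat missing data as non-breaching, the alarm definition accepts a TreatMissingData setting. The sketch below is not part of the answer above; it only builds the keyword arguments for boto3's put_metric_alarm call. The helper name bounce_alarm_params, the alarm name suffix, and the choice of the AWS/SNS NumberOfMessagesPublished metric are assumptions, since the question does not say which metric the alarm watches.

```python
def bounce_alarm_params(topic_name, threshold, alarm_action_arn):
    """Build keyword arguments for CloudWatch's PutMetricAlarm call.

    Hypothetical helper: the metric, alarm name, and period are assumptions.
    """
    return {
        "AlarmName": f"{topic_name}-bounces",
        "Namespace": "AWS/SNS",
        # Assumed metric: messages published to the bounce topic.
        "MetricName": "NumberOfMessagesPublished",
        "Dimensions": [{"Name": "TopicName", "Value": topic_name}],
        "Statistic": "Sum",
        "Period": 3600,          # one hour, in seconds
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        # Periods with no data points count as within the threshold.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [alarm_action_arn],
    }
```

Given AWS credentials and boto3, the result could be passed as boto3.client("cloudwatch").put_metric_alarm(**params). The topic name and ARN you supply are your own; none are implied by the question.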
Also, in case you're not already, be sure to use SUM rather than AVERAGE, since your use case involves looking at the count of messages during a period. My tests show that a SUM alarm triggers immediately, whereas AVERAGE requires more time.
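A small arithmetic sketch shows one reason a sum suits a count-based threshold. It assumes each bounce notification arrives as a single data point with value 1, which is an assumption for illustration rather than something stated above; summarize is a hypothetical helper.

```python
def summarize(datapoints):
    """Return (sum, average) for one period's data points."""
    total = sum(datapoints)
    average = total / len(datapoints) if datapoints else None
    return total, average


one_bounce = [1]        # assumed: one data point of value 1 per notification
ten_bounces = [1] * 10

print(summarize(one_bounce))   # (1, 1.0)
print(summarize(ten_bounces))  # (10, 1.0)
```

Under that assumption the average stays at 1.0 however many bounces occur, so only the sum reflects the count that a threshold is meant to catch.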