
Amazon EC2 ELB alarm - which instance is unhealthy?

We have hosted some apps on Amazon EC2 and are using an Elastic Load Balancer (ELB) to manage several instances of one app. Also, we have set up ELB alarms to get notified about Unhealthy Hosts, i.e. when an instance has gone down.

So far, the only place I have found to check which instance has gone down when the alarm goes off is the ELB status page in the AWS console. However, that does not help once the instance has returned to the InService state. The alarm's e-mail notification does not contain this information, and I couldn't find it anywhere in the alarm history in the console either.

Is there a way to tell which instance an ELB alarm has been triggered for, even if the instance has come back into OK state in the meantime?

Cheers, Alex

alexander.biskop asked Jan 30 '14 10:01



2 Answers

We are using the following Lambda function to make up for the lack of Health Check logging:

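'use strict';

var AWS = require('aws-sdk');
var elb = new AWS.ELB();

// Triggered via SNS by the CloudWatch alarm; the event payload itself is ignored.
exports.handler = (event, context, callback) => {

    var params = {
        LoadBalancerName: "<elb_name_here>"
    };
    // Log the current health of every instance registered with the ELB.
    elb.describeInstanceHealth(params, function(err, data) {
        if (err) {
            console.log(err, err.stack); // an error occurred
            return callback(err);
        }
        // Stringify so nested fields are written out in full in the log.
        console.log(JSON.stringify(data)); // successful response
        callback(null, data);
    });
};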

It does not produce the prettiest logs in CloudWatch, but the data is there. It lets us see, for example, whether a particular instance tends to drop out more often than the others. It is set up much like Gerardo Grignoli's answer: I added a CloudWatch alarm that sends an SNS message to the Lambda function when the alarm is triggered. The function doesn't do anything with the message itself; the message is merely the trigger for the Lambda function to run and log the instance status.
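If the full response is too noisy in the logs, the callback could log only the instances that are not in service. This is a minimal sketch, assuming the response has the shape documented for describeInstanceHealth (an InstanceStates array whose entries carry InstanceId, State, and Description). The helper name unhealthyInstances and the sample data are made up for illustration.

```javascript
'use strict';

// Hypothetical helper: keep only the instances the ELB does not report as InService.
function unhealthyInstances(data) {
    var states = (data && data.InstanceStates) || [];
    return states
        .filter(function (s) { return s.State !== 'InService'; })
        .map(function (s) {
            return { id: s.InstanceId, state: s.State, reason: s.Description };
        });
}

// Made-up sample in the assumed response shape, for illustration only.
var sample = {
    InstanceStates: [
        { InstanceId: 'i-0aaa', State: 'InService', ReasonCode: 'N/A', Description: 'N/A' },
        { InstanceId: 'i-0bbb', State: 'OutOfService', ReasonCode: 'Instance',
          Description: 'Instance has failed at least the UnhealthyThreshold number of health checks consecutively.' }
    ]
};

console.log(JSON.stringify(unhealthyInstances(sample)));
```

Inside the Lambda function, the same helper could be applied to the data argument before logging, so each log entry lists only the instances that were out of service when the alarm fired.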

tgodfrey answered Nov 03 '22 14:11



Sadly, Amazon does not provide a health check log, so it's impossible to find out afterwards which instance failed the health check once the server is healthy again. You can only use the per-AZ metrics to find out which AZ the instance is in.

However, you can find out which instance is down if you query the AWS API while the issue is occurring. So I have thought of a possible workaround:

  • Set up a new SNS topic, and add an HTTP action to a custom URL that triggers a job that enumerates the instances and sends you that info by mail.
  • Then set up a CloudWatch alarm for UnHealthyHostCount > 0 and set its action to the SNS topic.
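The job that receives the notification does not have to hard-code the load balancer name; it can usually be read from the alarm message. The sketch below assumes the SNS Message field is a JSON string with a Trigger.Dimensions array whose entries use lowercase name and value keys, which is how CloudWatch alarm notifications are commonly formatted. Verify this against a real message before relying on it. The function name loadBalancerFromAlarm and the sample message are hypothetical.

```javascript
'use strict';

// Hypothetical helper: pull the load balancer name out of a CloudWatch alarm
// notification. The dimension layout (lowercase "name"/"value") is an assumption.
function loadBalancerFromAlarm(messageJson) {
    var alarm = JSON.parse(messageJson);
    var dims = (alarm.Trigger && alarm.Trigger.Dimensions) || [];
    for (var i = 0; i < dims.length; i++) {
        if (dims[i].name === 'LoadBalancerName') {
            return dims[i].value;
        }
    }
    return null; // the alarm is not scoped to a load balancer
}

// Made-up alarm message, trimmed to the fields used above.
var sampleMessage = JSON.stringify({
    AlarmName: 'example-unhealthy-hosts',
    NewStateValue: 'ALARM',
    Trigger: {
        MetricName: 'UnHealthyHostCount',
        Namespace: 'AWS/ELB',
        Dimensions: [{ name: 'LoadBalancerName', value: 'example-elb' }]
    }
});

console.log(loadBalancerFromAlarm(sampleMessage));
```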

The difficult part is that your URL should handle the SNS subscription & confirmation described here.
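As a rough illustration of that handshake: SNS POSTs a JSON body whose Type field is SubscriptionConfirmation for the initial handshake and Notification for real messages, and the subscription is confirmed by visiting the SubscribeURL from the first message. The sketch below only classifies the body. The function name classifySnsPost is hypothetical, and the HTTP server, the request to SubscribeURL, and message signature verification are deliberately left out.

```javascript
'use strict';

// Hypothetical helper: decide what to do with the body SNS POSTs to the endpoint.
function classifySnsPost(rawBody) {
    var msg = JSON.parse(rawBody);
    if (msg.Type === 'SubscriptionConfirmation') {
        // The caller should issue an HTTP GET to this URL to confirm the subscription.
        return { action: 'confirm', url: msg.SubscribeURL };
    }
    if (msg.Type === 'Notification') {
        // A real alarm notification: run the job that enumerates the instances.
        return { action: 'notify', subject: msg.Subject, message: msg.Message };
    }
    return { action: 'ignore' };
}

// Made-up request bodies for illustration.
var confirmBody = JSON.stringify({
    Type: 'SubscriptionConfirmation',
    SubscribeURL: 'https://sns.example.invalid/confirm?Token=abc'
});
var alarmBody = JSON.stringify({
    Type: 'Notification',
    Subject: 'ALARM: example-unhealthy-hosts',
    Message: '{"NewStateValue":"ALARM"}'
});

console.log(classifySnsPost(confirmBody).action);
console.log(classifySnsPost(alarmBody).action);
```

Keeping the classification separate from the HTTP handling makes it easy to test without a running endpoint. A production handler should also verify the message signature before acting on it.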

The command to find out which instances are currently OutOfService is:

elb-describe-instance-health *LoadBalancerName* --region *YourRegion*
Gerardo Grignoli answered Nov 03 '22 13:11
