AWS Elastic Beanstalk health check issue

My web application is built with Django and served by Nginx. I deploy it as a Docker image on Elastic Beanstalk.

Normally there is no problem, but when the load balancer scales out and adds EC2 instances, my web server returns 502 Bad Gateway.

I checked the Elastic Beanstalk application logs: about 16% of the requests returned 5xx errors. That is when the load balancer adds EC2 instances, the web server starts returning 502 Bad Gateway, and the Elastic Beanstalk environment moves to the Degraded state.

Is this a common problem when the load balancer performs a health check? If not, how can I turn off the health check?

I am attaching a captured image for reference.


Seung asked May 29 '18 02:05





2 Answers

As far as I know, a 502 Bad Gateway error like this can only be mitigated by manually checking the main links on your website and confirming that each one is reachable with a simple GET request.

In my case, the login page and an about page were broken (sadly, about 33% of my website), which is why I got a 5xx error on the health check after uploading to EC2. I solved the problem by making those links work on the server: some functionality only ran on localhost and not on AWS, so I fixed that and got an OK status in the health check.
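The manual check described above can be scripted. The following is a minimal sketch, assuming the site is reachable over plain HTTP; the function names `fetch_status` and `find_broken_pages`, the base URL, and the page paths are hypothetical and not taken from the question.

```python
from urllib import error, request


def fetch_status(url, timeout=5.0):
    """Send a GET request and return the HTTP status code, or None if unreachable."""
    try:
        with request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except error.HTTPError as exc:
        # 4xx and 5xx responses are raised as HTTPError; the code is still useful.
        return exc.code
    except (error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return None


def find_broken_pages(base_url, paths, fetch=fetch_status):
    """Return {path: status} for every path that is unreachable or returns >= 400."""
    broken = {}
    for path in paths:
        status = fetch(base_url.rstrip("/") + path)
        if status is None or status >= 400:
            broken[path] = status
    return broken


# Example usage (hypothetical host and paths):
# print(find_broken_pages("http://localhost:8000", ["/", "/login/", "/about/"]))
```

Note that `urlopen` follows redirects, so a page that redirects is reported with the status of its final destination.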

I don't think there is any point in removing the health check: it gives you vital information about your website, and you probably don't want your website to have inaccessible pages.

Keep track of the logs to narrow down the problem.
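One way to use the logs is to count responses by status code and see what share are 5xx. This is a rough sketch, assuming Nginx writes its access log in the common "combined" format; the helper names and the log path in the usage comment are assumptions, so adjust them to your setup.

```python
import re
from collections import Counter

# In the combined format the status code follows the quoted request line,
# e.g. ... "GET /login/ HTTP/1.1" 502 173 "-" "Mozilla/5.0"
STATUS_RE = re.compile(r'" (\d{3}) ')


def status_counts(lines):
    """Count log lines per HTTP status code; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def share_5xx(counts):
    """Return the fraction of counted requests that ended in a 5xx status."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    errors = sum(n for code, n in counts.items() if code.startswith("5"))
    return errors / total


# Example usage (the path is an assumption):
# with open("/var/log/nginx/access.log") as log:
#     counts = status_counts(log)
#     print(counts.most_common(), share_5xx(counts))
```

Grouping the 5xx lines by request path is a natural next step for finding which pages fail.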

I hope you find the solution.

Aravind answered Sep 24 '22 10:09



While your code is being deployed, you will get 502 errors because the EC2 instance fails the health check call. You need to adjust the load balancer's default health check settings to allow enough time for your deployment to complete. Allow even more time if you also restart the server after each deployment.
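As an illustration of adjusting those settings, here is a sketch that builds Elastic Beanstalk option settings for a Classic Load Balancer health check and applies them with boto3. It assumes a Classic Load Balancer (the `aws:elb:healthcheck` namespace; Application Load Balancer environments use different option names), that boto3 and AWS credentials are available, and the helper names, numeric values, environment name, and `/health/` path are all hypothetical.

```python
def health_check_options(interval, timeout, healthy, unhealthy, path):
    """Build OptionSettings entries for a Classic Load Balancer health check."""
    if timeout >= interval:
        # A check that can outlast its own interval makes no sense.
        raise ValueError("timeout must be shorter than interval")
    values = {
        "Interval": interval,
        "Timeout": timeout,
        "HealthyThreshold": healthy,
        "UnhealthyThreshold": unhealthy,
    }
    options = [
        {"Namespace": "aws:elb:healthcheck", "OptionName": name, "Value": str(value)}
        for name, value in values.items()
    ]
    # The path that the health check request is sent to.
    options.append({
        "Namespace": "aws:elasticbeanstalk:application",
        "OptionName": "Application Healthcheck URL",
        "Value": path,
    })
    return options


def apply_options(environment_name, options):
    """Apply the settings to an environment (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    client = boto3.client("elasticbeanstalk")
    return client.update_environment(
        EnvironmentName=environment_name, OptionSettings=options
    )


# Example usage (hypothetical environment name and values):
# opts = health_check_options(interval=30, timeout=5, healthy=3, unhealthy=5, path="/health/")
# apply_options("my-env", opts)
```

Raising the unhealthy threshold or the interval gives a deploying instance more time before it is marked as failing; the right values depend on how long your deployment takes.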

The AWS load balancer sends a health check request to each registered instance every N seconds, using the path you specify. The default interval is 30 seconds. If the health check fails N times in a row (the default is 2) for any of your running instances, the health changes to Degraded or Severe, depending on the percentage of your instances that are not responding.

  1. Send a request that should return a 200 response code. The default path is '/index.html'.
  2. Wait N seconds before timing out (default 5 seconds).
  3. Try again after an interval of N seconds (default 30 seconds).
  4. If N consecutive calls fail, change the health state to warning or severe (the default unhealthy threshold is 2).
  5. After N consecutive successful calls, return the health state to OK (the default healthy threshold is 10).

With the default settings, if any web server instance is down for more than a minute (2 tries, 30 seconds apart), it is considered an outage. It then takes 5 minutes (10 tries, 30 seconds apart) to get back to OK status.
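The arithmetic above can be written down as a small sketch. The `HealthCheck` class is hypothetical and only multiplies the interval by the thresholds quoted in this answer; it is a rough estimate that ignores the per-request timeout.

```python
from dataclasses import dataclass


@dataclass
class HealthCheck:
    """Health check settings, using the defaults quoted in this answer."""
    interval: int = 30            # seconds between checks
    unhealthy_threshold: int = 2  # consecutive failures before "unhealthy"
    healthy_threshold: int = 10   # consecutive successes before "OK" again

    def seconds_to_unhealthy(self):
        """Approximate downtime before an instance is marked unhealthy."""
        return self.interval * self.unhealthy_threshold

    def seconds_to_healthy(self):
        """Approximate time for a recovered instance to be marked OK again."""
        return self.interval * self.healthy_threshold


defaults = HealthCheck()
print(defaults.seconds_to_unhealthy())  # 60  (one minute)
print(defaults.seconds_to_healthy())    # 300 (five minutes)
```

Plugging in other values shows the trade-off: a higher unhealthy threshold tolerates longer deployments, while a lower healthy threshold brings instances back into service sooner.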

For a detailed explanation and configuration options, please check the AWS documentation: Configure Health Checks for Elastic Load Balancing

Saeed D. answered Sep 23 '22 10:09
