I have an Elastic Beanstalk worker environment whose health transitioned to "Severe" with my latest deployment. The error it gives me is:
sqsd is in fault mode on all instances
How do I fix this, or at least get more information about it?
The "sqsd is in fault mode" error can have several causes. For example, the health check may fail with HTTP status code 400 or 500, depending on the underlying issue.
To find out more, you can SSH into the worker instance (e.g. EC2 management console > Instances > right-click the instance > Connect), then try probing http://localhost/ with a tool such as curl.
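If curl is not at hand on the instance, the same probe can be sketched in Python using only the standard library. This is a minimal sketch, not what sqsd itself does: the function name `probe` is made up for illustration, and it sends a GET request, whereas sqsd delivers messages with POST, so the status codes may differ from what the daemon sees.

```python
import urllib.error
import urllib.request

# Ignore any proxy settings from the environment: the probe targets the
# local instance, so the request should never leave the machine.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=5.0):
    """Return the HTTP status code of a GET request to `url`.

    Hypothetical helper for illustration. Connection problems
    (nothing listening, timeout) are raised as OSError subclasses.
    """
    try:
        with _OPENER.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is
        # exactly the piece of information we are after.
        return err.code
```

For example, `print(probe("http://localhost/health/"))` run on the worker instance would print 500 in the failing case described further down, assuming the application answers GET requests on that path the same way.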
On one occasion, we got a similar sqsd is in fault mode error from our worker environment, with a status 400. This was due to an incorrect ALLOWED_HOSTS value in our (Django) settings.py.
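For reference, a settings.py fragment along these lines avoids that 400. It is a sketch under the assumption that sqsd posts to http://localhost, so requests reach Django with a Host header of "localhost"; Django answers 400 when DEBUG is off and the Host header is not listed. The commented-out hostname is purely hypothetical.

```python
# settings.py (fragment)

# Assumption: the SQS daemon delivers messages to http://localhost,
# so the Host header seen by Django is "localhost".
ALLOWED_HOSTS = [
    "localhost",
    "127.0.0.1",
    # "worker.example.com",  # hypothetical: any other hostname the app serves
]
```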
On another occasion, we had a similar issue with a status 500 on our worker environment after trying to update to the latest Amazon Linux platform version. Note that our worker env had been running without any problems for many months, and we did not modify the application version, nor the environment configuration.
The logs (aws-sqsd/default.log) for the failed platform-update attempt show:
2018-10-19T09:06:52Z healthcheck-err: service healthcheck to URL "http://localhost/health/" failed with http status code "500"
whereas the logs from before the failed update attempt show this:
2018-10-19T08:38:43Z message: sent to http://localhost:80
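To pull such failures out of a longer log, something like the following may help. It is a rough sketch that assumes every failure line has exactly the shape of the healthcheck-err line quoted above; the names `HEALTHCHECK_ERR` and `healthcheck_failures` are made up for illustration.

```python
import re

# Matches lines shaped like the healthcheck-err sample quoted above:
# <timestamp> healthcheck-err: ... URL "<url>" failed with http status code "<code>"
HEALTHCHECK_ERR = re.compile(
    r'^(?P<time>\S+) healthcheck-err: .*?URL "(?P<url>[^"]+)" '
    r'failed with http status code "(?P<status>\d+)"'
)


def healthcheck_failures(lines):
    """Yield (timestamp, url, status) for every healthcheck-err line.

    Lines that do not match the assumed format are skipped silently.
    """
    for line in lines:
        match = HEALTHCHECK_ERR.match(line.strip())
        if match:
            yield match["time"], match["url"], int(match["status"])
```

On the instance this could be fed the open log file, e.g. `with open("/var/log/aws-sqsd/default.log") as f: print(list(healthcheck_failures(f)))`, where the /var/log prefix is an assumption about where the aws-sqsd directory lives.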
The funny thing is that, according to the AWS docs, workers should not even be able to use health check URLs (if I understand correctly):
In a single instance or worker tier environment, Elastic Beanstalk determines the instance's health by monitoring its Amazon EC2 instance status. Elastic Load Balancing health settings, including HTTP health check URLs, cannot be used in these environment types. [my emphasis]
Strangely enough, our worker environment was configured, at the time, using the EB web console, with deployment policy "rolling with additional batch," using "Health based rolling updates" from the dropdown menu.
This seems to be in direct contradiction to the quote above, which means that our active configuration is actually invalid (even though the env has been running successfully for a long time).
Sure enough, if I now try to modify something (anything) in the environment configuration using the EB web console, I suddenly get an error that was never there before:
"Invalid option value: 'Health' (Namespace: 'aws:autoscaling:updatepolicy:rollingupdate', OptionName: 'RollingUpdateType'): Health based rolling updates can not be enabled for worker tier environments."
Moreover, the "Health based rolling updates" option no longer appears in the dropdown for "Rolling update type" (yet it was there before I tried to apply the change).
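One possible way out, which is my assumption and not something stated in the error itself, is to set the rolling update type to a value other than "Health". The sketch below only builds the option setting named in the error message; the helper name is hypothetical, and "Time" and "Immutable" are the other values I know for this option.

```python
def rolling_update_option(update_type="Time"):
    """Build the RollingUpdateType option setting for a worker environment.

    Hypothetical helper: it only constructs a dict and does not call AWS.
    "Health" is refused because the error above says it cannot be
    enabled for worker tier environments.
    """
    allowed_for_workers = {"Time", "Immutable"}
    if update_type not in allowed_for_workers:
        raise ValueError(
            "RollingUpdateType %r is not usable for worker tiers" % update_type
        )
    return {
        "Namespace": "aws:autoscaling:updatepolicy:rollingupdate",
        "OptionName": "RollingUpdateType",
        "Value": update_type,
    }
```

A dict of this shape could then be passed in the OptionSettings list of an environment update, for example through boto3's `update_environment` call or an .ebextensions option_settings entry, instead of going through the web console.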
-- edit --
The issue described above was confirmed by AWS support.