The RabbitMQ cluster is not accepting new connections! The number of connected sockets is low, and the only message in the RabbitMQ log is:
** WARNING ** Mnesia is overloaded
What does that mean? How can I solve it?
To be able to recover in case RabbitMQ goes down, our queues are durable and all our messages are marked as persistent. We generally have a very low number of messages in flight at any moment in time.
Since our RabbitMQ was configured to persist all messages, this should generally be possible. Surely I wouldn’t be the first one to attempt this.
The following placeholder is used below: RABBITMQ_MNESIA_BASE, which is /var/lib/rabbitmq/mnesia on Debian (see RabbitMQ’s file locations).
Deploy the cluster with a fixed user / password / cookie. Delete the pods one by one to recover, starting with -0. It is unsafe to try to change the user/password after deployment: RabbitMQ only inserts these users if no data is present, so health checks start to fail and nodes cannot recover.
You need to increase dc_dump_limit. Based on the docs:
-mnesia dc_dump_limit Number. Controls how often disc_copies tables are dumped from memory. Tables are dumped when filesize(Log) > (filesize(Tab)/Dc_dump_limit). Lower values reduce CPU overhead but increase disk space and startup times. Default is 4.
So starting Erlang with a suitable `-mnesia dc_dump_limit X` (a space, not `=`, separates the parameter name from its value) might help fix the situation.
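To make the quoted rule concrete, here is a minimal Python sketch of the documented condition filesize(Log) > (filesize(Tab)/Dc_dump_limit). It only mirrors the arithmetic and is not how Mnesia is implemented; the function name `should_dump`, the 100 MiB table size, and the value 40 are assumptions chosen for illustration.

```python
MIB = 1024 * 1024


def should_dump(log_bytes: int, table_bytes: int, dc_dump_limit: int = 4) -> bool:
    """Return True when the documented dump condition holds.

    Hypothetical helper: it restates filesize(Log) > filesize(Tab) / Dc_dump_limit
    from the quoted docs and does not call into Mnesia.
    """
    return log_bytes > table_bytes / dc_dump_limit


# Assumed example: a 100 MiB disc_copies table.
table_bytes = 100 * MIB

for limit in (4, 40):  # 4 is the documented default, 40 is an arbitrary larger value
    threshold_mib = table_bytes / limit / MIB
    print(f"dc_dump_limit={limit}: dump once the log exceeds {threshold_mib} MiB")
```

With a larger dc_dump_limit the log has to grow less before a dump is triggered, so dumps happen more often but each one has a smaller log to process. That matches the trade-off in the quoted docs, where lower values reduce CPU overhead at the cost of disk space and startup time.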
This can happen in a number of circumstances, from the machine being suspended to the runtime scheduler first giving Mnesia processes no chance to run and then a lot of time (so that many events fire in a very short period).
If you can, consider using Erlang 17.x. Generally, this warning is not an indication of a problem.
See http://streamhacker.com/2008/12/10/how-to-eliminate-mnesia-overload-events/ for more detailed info.