When the Mesos scheduler (or slave) runs on a different machine than the Mesos master, it keeps trying to connect to the master but gets disconnected. This cycle repeats continuously. How can I fix this?
Both the framework (and slaves) and the master need to be able to talk to each other. In other words, if one of the endpoints advertises an address the other side cannot reach (e.g., the loopback address 127.0.0.1), it won't work. If you want the master or slave to use a reachable IP, pass the --ip flag. For the framework, set LIBPROCESS_IP in the environment.
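To make the LIBPROCESS_IP advice concrete, here is a rough Python sketch that asks the OS which local address it would use to reach the master, refuses loopback, and builds an environment with LIBPROCESS_IP set. The helper names (`outbound_ip`, `is_loopback`, `libprocess_env`), the host `master.example.com`, and the `./my_framework` binary are hypothetical, and port 5050 is assumed to be the master's port; adjust to your setup.

```python
import ipaddress
import os
import socket


def outbound_ip(master_host: str, master_port: int = 5050) -> str:
    """Return the local IP the OS would pick to reach the master.

    Connecting a UDP socket sends no packets; it only makes the kernel
    choose a route, so getsockname() reveals the source address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((master_host, master_port))
        return s.getsockname()[0]


def is_loopback(ip: str) -> bool:
    """True for 127.0.0.0/8 (and ::1), which a remote master cannot reach."""
    return ipaddress.ip_address(ip).is_loopback


def libprocess_env(master_host: str, master_port: int = 5050) -> dict:
    """Copy the current environment and add LIBPROCESS_IP (assumed setup)."""
    ip = outbound_ip(master_host, master_port)
    if is_loopback(ip):
        raise RuntimeError(f"would advertise loopback address {ip}")
    env = dict(os.environ)
    env["LIBPROCESS_IP"] = ip
    return env


# Hypothetical usage: launch the framework with the corrected environment.
# import subprocess
# subprocess.run(["./my_framework"], env=libprocess_env("master.example.com"))
```

This only picks the address the local routing table prefers; on a multi-homed or NATed host you may still need to set the IP by hand.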
We need a bit more information to go on, but it sounds like you aren't advertising the slave on an IP the master can reach.
As mentioned above, a slave will happily advertise its IP address as 127.0.0.1 (localhost), which isn't reachable from the master unless both are on the same server. This should show up in the master and slave logs, so check those.
Firewalls can also be an issue, so try temporarily disabling them to rule them out.
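If you'd rather test reachability before switching a firewall off, a plain TCP connect in each direction is usually enough. This is a minimal sketch, not a Mesos tool: `can_connect` is a hypothetical helper, the host names are placeholders, and the ports assume the usual defaults (5050 for the master, 5051 for the slave).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Hypothetical usage; run the first check on the slave, the second on the master.
# print(can_connect("master.example.com", 5050))
# print(can_connect("slave.example.com", 5051))
```

A failure here with both processes running points at a firewall or routing problem rather than at Mesos itself.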