Can Apache Mesos 'master' nodes be co-located on the same machine as Mesos 'slave' nodes? Similarly, for high-availability (HA) deploys, can the Apache ZooKeeper nodes used in Mesos 'master' election be deployed on the same machines as Mesos 'slave' nodes?
Mesos recommends that 3 'masters' be used for HA deploys, and Zookeeper recommends that 5 nodes be used for its quorum election system. It would be nice to have these services running alongside Mesos 'slave' processes instead of committing 8 machines to effectively 'non-productive' tasks.
If such a setup is feasible, what are the pros/cons of such a setup?
Thanks!
Apache Mesos is an open source cluster manager that handles workloads in a distributed environment through dynamic resource sharing and isolation. Mesos is suited for the deployment and management of applications in large-scale clustered environments.
Mesos consists of a master daemon that manages agent daemons running on each cluster node, and Mesos frameworks that run tasks on these agents. The master enables fine-grained sharing of resources (CPU, RAM, …) across frameworks by making them resource offers.
The Mesos slaves are responsible for executing tasks from frameworks using the resources they have. Each slave must provide proper isolation while running multiple tasks, and the isolation mechanism should ensure that each task gets exactly the resources it was promised, no more and no less.
You can definitely run a master, slave, and zk process all on the same node. You can even run multiple master and slave processes on the same node, provided you give them each unique ports, but that's only useful for a test cluster.
Typically we recommend running ZK on the same nodes as your masters, but if you have extra ZKs, you can certainly run them on slaves, or mix-and-match as you see fit, as long as all master/slave/framework nodes can reach the ZK nodes, and all slaves can reach the masters.
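As a rough illustration of the reachability requirement, here is a minimal Python sketch that builds the ZooKeeper URL every master and slave would be pointed at, plus a majority quorum size for the masters. The hostnames, the default port 2181, and the `/mesos` znode path are assumptions chosen for the example, and the helper names `zk_url` and `master_quorum` are hypothetical; check the flag names against the Mesos version you run.

```python
def zk_url(hosts, port=2181, path="/mesos"):
    """Build a zk:// URL that lists every ZooKeeper node.

    The port and znode path defaults are assumptions for this sketch.
    """
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    endpoints = ",".join(f"{host}:{port}" for host in hosts)
    return f"zk://{endpoints}{path}"


def master_quorum(num_masters):
    """Return the smallest strict majority of the masters."""
    if num_masters < 1:
        raise ValueError("need at least one master")
    return num_masters // 2 + 1


# Hypothetical hosts: the ZK nodes could live on masters, slaves, or a mix,
# as long as every master, slave, and framework can reach all of them.
zk_hosts = ["node1.example.com", "node2.example.com", "node3.example.com"]
url = zk_url(zk_hosts)

master_flags = [f"--zk={url}", f"--quorum={master_quorum(3)}"]
slave_flags = [f"--master={url}"]

print(" ".join(master_flags))
print(" ".join(slave_flags))
```

Because masters and slaves share the same URL, moving a ZK node from a master machine to a slave machine only changes the host list, not the way the other components are configured.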
For a smaller cluster (<10 nodes) it could make sense to run a slave process on each master, especially since the standby masters won't be doing much. Even an active master for a small cluster uses only a small amount of CPU, memory, and network resources. Just make sure you adjust the --resources flag on that slave to account for the master's resource usage, so the slave does not offer resources the master needs.
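One way to picture that adjustment is the small Python sketch below, which subtracts a reservation for the co-located master from the machine's totals and formats the remainder as a --resources value. The helper name `agent_resources` and the reservation of 1 CPU and 1024 MB are assumptions for illustration, not measured or recommended figures; size them from what your own master actually uses.

```python
def agent_resources(total_cpus, total_mem_mb, master_cpus=1.0, master_mem_mb=1024):
    """Format a --resources value that leaves headroom for a co-located master.

    The default reservation is a placeholder assumption, not a recommendation.
    """
    cpus = total_cpus - master_cpus
    mem_mb = total_mem_mb - master_mem_mb
    if cpus <= 0 or mem_mb <= 0:
        raise ValueError("machine is too small to host both master and slave")
    # cpus may be fractional; memory is expressed in whole megabytes
    return f"cpus:{cpus:g};mem:{int(mem_mb)}"


# Example: an 8-core machine with 16 GB of RAM that also runs a master
resources = agent_resources(total_cpus=8, total_mem_mb=16384)
print(f'--resources="{resources}"')
```

If ZooKeeper also runs on the same machine, its footprint would need to be subtracted in the same way.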
Once your cluster grows larger (especially >100 nodes), the network traffic to/from the master, as well as its CPU/memory utilization, becomes significant enough that you wouldn't want to run a Mesos slave on the same node as the master. It should be fine to co-locate ZK with your master even at large scale.
You didn't specifically ask, but I'll also discuss where to run your framework schedulers (e.g. Spark, Marathon, or Chronos). These could be co-located with any of the other components, but they only really need to be able to reach the master and zk nodes, since all communication to slaves goes through the master. Some customers run the schedulers on master nodes, some run them on edge nodes (so users don't have access to the slaves), and others use meta-frameworks like Marathon to run other schedulers on slaves as Mesos tasks.