Can I use multiple nodes as cluster master?
Why would I do this? Maybe to distribute queries?
Another question: can the master node be a smaller machine than the data nodes? My current cluster is:
n1 - 8gb ram, 4 cpu - (x) master - ( ) data
n2 - 4gb ram, 2 cpu - ( ) master - (x) data
n3 - 4gb ram, 2 cpu - ( ) master - (x) data
n4 - 4gb ram, 2 cpu - ( ) master - (x) data
n5 - 4gb ram, 2 cpu - ( ) master - (x) data
All my queries are sent to n1, and I see in htop that the master node always has low CPU/RAM usage while the data nodes take most of the CPU/RAM load.
You can launch an EMR cluster with multiple master nodes in both public and private VPC subnets.
Three dedicated master nodes, the recommended number, provide two backup nodes in the event of a master node failure and the necessary quorum (2) to elect a new master. Four dedicated master nodes are not better than three and can cause issues if you use multiple Availability Zones.
You can have as many data nodes, ingest nodes, machine learning nodes, etc. as you need.
But to run multiple nodes on the same host you need a different elasticsearch.yml for every node, each with separate data and log folders. There isn't a way to use the same elasticsearch.yml to run multiple nodes at the same time.
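As a rough illustration of the per-node configuration described above, here is a minimal Python sketch that renders one elasticsearch.yml body per node. The settings `node.name`, `path.data`, `path.logs` and `http.port` are standard Elasticsearch settings; the helper name `node_config`, the node names, the directory layout under `/srv/es` and the port numbers are assumptions made up for this example.

```python
from pathlib import PurePosixPath


def node_config(name: str, base_dir: str, http_port: int) -> str:
    """Render a minimal elasticsearch.yml body for one node.

    Each node gets its own data and log folders under base_dir/<name>,
    so two nodes on the same host never share a path.
    The directory layout is hypothetical.
    """
    base = PurePosixPath(base_dir) / name
    lines = [
        f"node.name: {name}",
        f"path.data: {base / 'data'}",
        f"path.logs: {base / 'logs'}",
        f"http.port: {http_port}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two nodes on one host: separate files, separate folders, separate ports.
    for name, port in [("node-a", 9200), ("node-b", 9201)]:
        print(f"# elasticsearch.yml for {name}")
        print(node_config(name, "/srv/es", port))
```

Each rendered text would be saved as the elasticsearch.yml in that node's own config directory.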
Answer 1) You cannot have more than one active master node at a time.
Answer 2) Suppose you have 3 nodes, n1, n2 and n3, that all contain data, and n1 is currently elected as the master node. If you send a query to n2, the query will be distributed to all the relevant shards of the indexes [replica shard or primary shard]. The results from each shard are combined and returned to you (see the query phase docs).
The query does not have to be distributed by the master node. Any node, whether data, master or non-data, can act as the router [distributing search queries].
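The scatter/gather flow from Answer 2 can be sketched in plain Python. This is a toy model, not Elasticsearch code: the function names `search_shard` and `coordinate` are hypothetical, shards are plain dictionaries, and the "score" is just a term count rather than real relevance scoring.

```python
import heapq


def search_shard(shard_docs: dict, term: str, size: int) -> list:
    """Query one shard copy and return its local top hits as (score, doc_id)."""
    hits = [
        (text.count(term), doc_id)
        for doc_id, text in shard_docs.items()
        if term in text
    ]
    return heapq.nlargest(size, hits)


def coordinate(shards: list, term: str, size: int = 3) -> list:
    """Act as the receiving node: fan out to every shard, then merge.

    Any node that receives the request can play this role; it does not
    have to be the elected master.
    """
    partial = []
    for shard_docs in shards:
        partial.extend(search_shard(shard_docs, term, size))
    # Combine the per-shard results into one global top-`size` list.
    return heapq.nlargest(size, partial)


if __name__ == "__main__":
    shards = [
        {"d1": "master node master", "d2": "data node"},
        {"d3": "master election"},
    ]
    print(coordinate(shards, "master", size=2))
```

The node running `coordinate` only merges small per-shard result lists, while the shards do the actual searching, which fits the observation that the node receiving the queries stays lightly loaded.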
Answer 3) Yes, the master node can be smaller if it does not hold data, because it does not have to handle data management. In a setup like yours, its only query-related work is to route the queries to the corresponding nodes and return the results to you. If the master node also holds data, it should be sized larger than a plain data node because it has 2 jobs [data management, query routing].
You cannot have multiple masters running in a cluster at the same time, BUT you can configure multiple nodes as master-eligible so that one of them can be elected master when the current master goes down.
See also the discovery.zen.minimum_master_nodes setting for more explanation. There you can also find that it's better to have 1 electable master node than 2 (you should have 1 or 3+).
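The "1 or 3+" advice follows from the quorum arithmetic. Here is a small Python sketch, assuming the usual majority rule of (master-eligible nodes / 2) + 1 for discovery.zen.minimum_master_nodes; the function names are hypothetical and only illustrate the calculation.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Majority quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can fail while a quorum remains."""
    return master_eligible - minimum_master_nodes(master_eligible)


if __name__ == "__main__":
    for n in range(1, 6):
        print(
            f"{n} master-eligible: quorum={minimum_master_nodes(n)}, "
            f"can lose {tolerated_failures(n)}"
        )
```

With 2 master-eligible nodes the quorum is 2, so losing either node stops elections, which is no better than having 1. Likewise 4 nodes tolerate only one failure, the same as 3.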