 

ZooKeeper installation on multiple AWS EC2 instances

I am new to ZooKeeper and AWS EC2. I am trying to install ZooKeeper on 3 EC2 instances.

As per the ZooKeeper documentation, I have installed ZooKeeper on all 3 instances, created zoo.cfg, and added the configuration below:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=localhost:2888:3888
server.2=<public ip of ec2 instance 2>:2889:3889
server.3=<public ip of ec2 instance 3>:2890:3890

I have also created the myid file on all 3 instances at /opt/zookeeper/data/myid, as per the guidelines.
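For example, on instance 1 the file just contains that server's number (and similarly "2" and "3" on the other two instances):

echo "1" > /opt/zookeeper/data/myid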

I have a couple of queries, as below:

  1. Whenever I start the ZooKeeper server on an instance, it starts in standalone mode (as per the logs). Why?

  2. Will the above configuration really let the instances connect to each other? What are the ports 2889:3889 and 2890:3890 for? Do I need to configure them on the EC2 machines, or should I use different ports?

  3. Do I need to create a security group to open these connections? I am not sure how to do that for an EC2 instance.

  4. How can I confirm that all 3 ZooKeeper servers have started and can communicate with each other?

asked Mar 28 '15 by Bharat

2 Answers

The ZooKeeper configuration is designed such that you can install the exact same configuration file on all servers in the cluster without modification. This makes ops a bit simpler. The component that specifies the configuration for the local node is the myid file.

The configuration you've defined is not one that can be shared across all servers. All of the servers in your server list should bind to a private IP address that is accessible to the other nodes in the network. You're seeing your server start in standalone mode because you're binding to localhost, which the other servers in the cluster can't see.

Your configuration should look more like:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=<private ip of ec2 instance 1>:2888:3888
server.2=<private ip of ec2 instance 2>:2888:3888
server.3=<private ip of ec2 instance 3>:2888:3888

The two ports listed in each server definition are respectively the quorum and election ports used by ZooKeeper nodes to communicate with one another internally. There's usually no need to modify these ports, and you should try to keep them the same across servers for consistency.

Additionally, as I said, you should be able to share that exact same configuration file across all instances. The only thing that should have to change is the myid file.

You will probably need to create a security group that opens the client port (2181) to your clients and the quorum/election ports (2888/3888) to the other ZooKeeper servers.
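As a rough sketch, assuming the AWS CLI is set up and all three instances share one security group (the group ID sg-0123456789abcdef0 and the 10.0.0.0/16 client range below are placeholders), the rules could be added like this:

# client port, open to wherever your clients live
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 2181 --cidr 10.0.0.0/16

# quorum and election ports, open only to the other ZooKeeper servers
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 2888 --source-group sg-0123456789abcdef0
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 3888 --source-group sg-0123456789abcdef0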

Finally, you might want to look into a UI to help manage the cluster. Netflix makes a decent one (Exhibitor) that will give you a view of your cluster and also help with cleaning up old logs and storing snapshots to S3. (ZooKeeper takes snapshots but does not delete old transaction logs, so your disk will eventually fill up if they're not properly removed.) Once everything is configured correctly, you should also be able to see the ZooKeeper servers connecting to each other in the logs.
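On the last question: a quick way to confirm all three servers are up and talking, using the scripts that ship with ZooKeeper, is:

# run on each instance; one node should report "Mode: leader", the others "Mode: follower"
bin/zkServer.sh status

# or query any node directly with the "stat" four-letter command
echo stat | nc <private ip of ec2 instance 2> 2181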

EDIT

@czerasz notes that starting from version 3.4.0 you can use the autopurge.snapRetainCount and autopurge.purgeInterval directives to keep your snapshots clean.
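For example, in zoo.cfg:

# keep the 3 most recent snapshots and purge the rest every 24 hours
autopurge.snapRetainCount=3
autopurge.purgeInterval=24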

@chomp notes that some users have had to use 0.0.0.0 for the local server IP to get the ZooKeeper configuration to work on EC2. In other words, replace <private ip of ec2 instance 1> with 0.0.0.0 in the configuration file on instance 1. This is counter to the way ZooKeeper configuration files are designed but may be necessary on EC2.
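For example, the server list in the configuration file on instance 1 would become:

server.1=0.0.0.0:2888:3888
server.2=<private ip of ec2 instance 2>:2888:3888
server.3=<private ip of ec2 instance 3>:2888:3888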

answered Sep 28 '22 by kuujo


Adding some additional info regarding ZooKeeper clustering inside Amazon's VPC.

The solution using the VPC's IP addresses should be the preferred one; using '0.0.0.0' should be your last option. In particular, if you are running Docker on your EC2 instances, '0.0.0.0' will not work properly with ZooKeeper 3.5.x after a node restart.

The issue lies in how '0.0.0.0' is resolved, how the ensemble shares node addresses, and the SID order (if you start your nodes in descending order, this issue may not occur).

So far, the only working solution is to upgrade to version 3.6.2+.

answered Sep 28 '22 by Kucera.Jan.CZ