I'm following http://jayatiatblogs.blogspot.com/2011/11/storm-installation.html & http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#sc_zkMulitServerSetup to set up an Apache Storm cluster on Ubuntu 14.04 LTS on AWS EC2.
My master node is 10.0.0.185. My slave nodes are 10.0.0.79, 10.0.0.124 & 10.0.0.84, with myid of 1, 2 and 3 in their zookeeper-data respectively. I set up an Apache ZooKeeper ensemble consisting of all 3 slave nodes.
Below is the zoo.cfg for my slave nodes:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/ubuntu/zookeeper-data
clientPort=2181
server.1=10.0.0.79:2888:3888
server.2=10.0.0.124:2888:3888
server.3=10.0.0.84:2888:3888
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
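For completeness, the myid files mentioned above can be created like this on each slave (a sketch; the ZooKeeper install path /home/ubuntu/zookeeper is an assumption, and the directory must match dataDir in zoo.cfg):
# On 10.0.0.79 (use 2 and 3 on the other slaves):
mkdir -p /home/ubuntu/zookeeper-data
echo 1 > /home/ubuntu/zookeeper-data/myid
# After starting all three nodes, verify the ensemble state:
/home/ubuntu/zookeeper/bin/zkServer.sh status
One node should report "Mode: leader" and the other two "Mode: follower".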
Below is the storm.yaml for my slave nodes:
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
- "10.0.0.79"
- "10.0.0.124"
- "10.0.0.84"
# - "localhost"
storm.zookeeper.port: 2181
# nimbus.host: "localhost"
nimbus.host: "10.0.0.185"
storm.local.dir: "/home/ubuntu/storm/data"
java.library.path: "/usr/lib/jvm/java-7-oracle"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
- 6704
#
# worker.childopts: "-Xmx768m"
# nimbus.childopts: "-Xmx512m"
# supervisor.childopts: "-Xmx256m"
#
# ##### These may optionally be filled in:
#
## List of custom serializations
# topology.kryo.register:
# - org.mycompany.MyType
# - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
# - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
# - "server1"
# - "server2"
## Metrics Consumers
# topology.metrics.consumer.register:
# - class: "backtype.storm.metric.LoggingMetricsConsumer"
# parallelism.hint: 1
# - class: "org.mycompany.MyMetricsConsumer"
# parallelism.hint: 1
# argument:
# - endpoint: "metrics-collector.mycompany.org"
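For reference, this is roughly how I bring up each slave node (install paths are abbreviations; the actual directories may differ):
# On each slave node:
/home/ubuntu/zookeeper/bin/zkServer.sh start
/home/ubuntu/storm/bin/storm supervisor &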
Below is the storm.yaml for my master node:
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
- "10.0.0.79"
- "10.0.0.124"
- "10.0.0.84"
# - "localhost"
#
storm.zookeeper.port: 2181
nimbus.host: "10.0.0.185"
# nimbus.thrift.port: 6627
# nimbus.task.launch.secs: 240
# supervisor.worker.start.timeout.secs: 240
# supervisor.worker.timeout.secs: 240
ui.port: 8772
# nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
# ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
# supervisor.childopts: "-Djava.net.preferIPv4Stack=true"
# worker.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
storm.local.dir: "/home/ubuntu/storm/data"
java.library.path: "/usr/lib/jvm/java-7-oracle"
# supervisor.slots.ports:
# - 6700
# - 6701
# - 6702
# - 6703
# - 6704
# worker.childopts: "-Xmx768m"
# nimbus.childopts: "-Xmx512m"
# supervisor.childopts: "-Xmx256m"
# ##### These may optionally be filled in:
#
## List of custom serializations
# topology.kryo.register:
# - org.mycompany.MyType
# - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
# - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
# - "server1"
# - "server2"
## Metrics Consumers
# topology.metrics.consumer.register:
# - class: "backtype.storm.metric.LoggingMetricsConsumer"
# parallelism.hint: 1
# - class: "org.mycompany.MyMetricsConsumer"
# parallelism.hint: 1
# argument:
# - endpoint: "metrics-collector.mycompany.org"
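On the master node, nimbus and the UI are started roughly like this (a sketch; the Storm install path is an assumption):
# On the master node:
/home/ubuntu/storm/bin/storm nimbus &
/home/ubuntu/storm/bin/storm ui &
# The UI then listens on the ui.port configured above:
# http://10.0.0.185:8772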
I start ZooKeeper on all my slave nodes, then start storm nimbus on my master node, then start storm supervisor on all my slave nodes. However, when I view the Storm UI, there is only 1 supervisor with a total of 5 slots in the cluster summary, and only 1 entry in the supervisor summary. Why is that?
How many slave nodes are actually working if I submit a topology in this case?
Why is it not 3 supervisors with a total of 15 slots?
What should I do in order to have 3 supervisors?
When I check supervisor.log on the slave nodes, the cause is as below:
2015-05-29T09:21:24.185+0000 b.s.d.supervisor [INFO] 5019754f-cae1-4000-beb4-fa016bd1a43d still hasn't started
There are two kinds of nodes on a Storm cluster: the master node and the worker nodes. The master node runs a daemon called "Nimbus" that is similar to Hadoop's "JobTracker". Nimbus is responsible for distributing code around the cluster, assigning tasks to machines, and monitoring for failures.
Storm uses Zookeeper for coordinating the cluster. Zookeeper is not used for message passing, so the load Storm places on Zookeeper is quite low. Single node Zookeeper clusters should be sufficient for most cases, but if you want failover or are deploying large Storm clusters you may want larger Zookeeper clusters.
To install Storm locally, download a release from here and unzip it somewhere on your computer. Then add the unpacked bin/ directory onto your PATH and make sure the bin/storm script is executable. Installing a Storm release locally is only for interacting with remote clusters.
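For example (a sketch, assuming the release was unpacked to /home/ubuntu/storm):
export PATH=/home/ubuntu/storm/bin:$PATH
chmod +x /home/ubuntu/storm/bin/storm
storm version   # should print the client version if the PATH is set up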
What you are doing is perfect, and it works too.
The only thing you should change is your storm.local.dir. It is the same on the slave and master nodes; change the storm.local.dir path on the nimbus and supervisor nodes (don't use the same local path). When you use the same local path, the nimbus and the supervisors share the same ID. They all come up, but you don't see all the slots; e.g. instead of 8 slots, you are shown only 4 slots as workers.
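You can confirm the shared ID by listing the supervisor znodes in ZooKeeper; every live supervisor registers an ephemeral node there (a sketch, assuming the default storm.zookeeper.root of /storm and the ZooKeeper install path from above):
/home/ubuntu/zookeeper/bin/zkCli.sh -server 10.0.0.79:2181
# Inside the CLI:
ls /storm/supervisors
# Three healthy supervisors show three distinct IDs here;
# supervisors that share an ID collapse into a single entry.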
Change storm.local.dir: "/home/ubuntu/storm/data" and don't use the same path on the supervisor and nimbus nodes.
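A sketch of the fix (the nimbus path below is a hypothetical example; any path distinct from the supervisors' will do):
# On the master node, give nimbus its own directory in storm.yaml:
#   storm.local.dir: "/home/ubuntu/storm/nimbus-data"
# Then, on each node, stop the daemons, clear the stale local
# state so a fresh ID is generated, and restart:
rm -rf /home/ubuntu/storm/data/*
/home/ubuntu/storm/bin/storm nimbus &      # master only
/home/ubuntu/storm/bin/storm supervisor &  # each slave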