How does Storm assign tasks to its workers? How does load balancing work?
Storm assigns tasks to workers when you submit the topology via "storm jar ..."
A typical Storm cluster will have many Supervisors (aka Storm nodes). Each Supervisor node (server) will have many Worker processes running. The number of workers per Supervisor is determined by how many ports you assign with supervisor.slots.ports .
When the topology is submitted via "storm jar" the Storm platform determines which workers will host each of your spouts and bolts (aka tasks). The number of workers and executors which will host your topology is dependent on the "parallelism" that you set during development, when the topology is submitted, or changed in a live running topology using "storm rebalance".
Michael Noll has a great breakdown of Parallelism, Workers and Tasks in his blog post here: http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/#example-of-a-running-topology
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With