I don't understand how to split a stream in Apache Storm. For example, I have bolt A that, after some computation, has somevalue1, somevalue2, and somevalue3. It wants to send somevalue1 to bolt B, somevalue2 to bolt C, and both somevalue1 and somevalue2 to bolt D. How would I do this in Storm? What grouping would I use, and what would my topology look like? Thank you in advance for your help.
There are eight built-in stream groupings in Storm, and you can write your own by implementing the CustomStreamGrouping interface. Shuffle grouping: tuples are randomly distributed across the bolt's tasks so that each task receives a roughly equal number of tuples.
A Storm topology is basically a Thrift structure. The TopologyBuilder class provides simple methods for building complex topologies: setSpout adds a spout, setBolt adds a bolt, and createTopology produces the finished topology.
Basically, a bolt is the processing powerhouse of a Storm topology and is responsible for transforming a stream.
The Nimbus node is the master in a Storm cluster. It is responsible for distributing the application code across various worker nodes, assigning tasks to different machines, monitoring tasks for any failures, and restarting them as and when required. Nimbus is stateless and stores all of its data in ZooKeeper.
You can use different streams if your case needs that. It is not really splitting, but it gives you a lot of flexibility. For instance, you could use it for content-based routing from a bolt:
You declare the stream in the bolt:
@Override
public void declareOutputFields(final OutputFieldsDeclarer outputFieldsDeclarer) {
    outputFieldsDeclarer.declareStream("stream1", new Fields("field1"));
    outputFieldsDeclarer.declareStream("stream2", new Fields("field1"));
}
You emit from the bolt on the chosen stream:
collector.emit("stream1", new Values("field1Value"));
You subscribe to the correct stream when building the topology:
builder.setBolt("myBolt1", new MyBolt1()).shuffleGrouping("boltWithStreams", "stream1");
builder.setBolt("myBolt2", new MyBolt2()).shuffleGrouping("boltWithStreams", "stream2");
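Applied to the question's setup, the same three steps might look like the sketch below. It is only an illustration under some assumptions: the `org.apache.storm` package names (Storm 1.0 and later), the stream ids `toB`, `toC`, `toD`, the component ids, and the placeholder computation inside `execute` are all made up for the example. The spout and bolts B, C, and D are passed in as parameters because the question does not show them, and the spout is assumed to emit on the default stream.

```java
import org.apache.storm.generated.StormTopology;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.IBasicBolt;
import org.apache.storm.topology.IRichSpout;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class BoltA extends BaseBasicBolt {

    // One named stream per downstream consumer (stream ids are hypothetical).
    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declareStream("toB", new Fields("somevalue1"));
        declarer.declareStream("toC", new Fields("somevalue2"));
        declarer.declareStream("toD", new Fields("somevalue1", "somevalue2"));
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        // Placeholder computation (assumption): the real bolt A derives
        // its values however it needs to.
        String raw = input.getString(0);
        String somevalue1 = raw.toUpperCase();
        Integer somevalue2 = raw.length();

        // Emit each value (or pair of values) on the stream meant for its consumer.
        collector.emit("toB", new Values(somevalue1));
        collector.emit("toC", new Values(somevalue2));
        collector.emit("toD", new Values(somevalue1, somevalue2));
    }

    // Wires each consumer to exactly one of bolt A's named streams.
    public static StormTopology buildTopology(IRichSpout source,
                                              IBasicBolt boltB,
                                              IBasicBolt boltC,
                                              IBasicBolt boltD) {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("source", source);
        builder.setBolt("boltA", new BoltA()).shuffleGrouping("source");
        builder.setBolt("boltB", boltB).shuffleGrouping("boltA", "toB");
        builder.setBolt("boltC", boltC).shuffleGrouping("boltA", "toC");
        builder.setBolt("boltD", boltD).shuffleGrouping("boltA", "toD");
        return builder.createTopology();
    }
}
```

The grouping chosen for each subscription (shuffle here) only decides how tuples are spread over the tasks of one consumer. Which bolt receives which values is decided by the stream id.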
You have two options here: stream groupings and "Direct Grouping". Depending on your requirements, one of them will serve you.
Have a look at the WordCountTopology sample project to see whether that is what you are looking for. Otherwise, "Direct Grouping" is going to be a better alternative.
But again, picking a grouping strategy depends on your requirements.
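For reference, a rough sketch of what direct grouping involves on the emitting side is shown below. The producer declares the stream as direct, looks up the consumer's task ids, and picks the receiving task itself. The component id `boltB`, the stream id `directStream`, and the hash-based task choice are assumptions made for the illustration, as are the `org.apache.storm` package names (Storm 1.0 and later).

```java
import java.util.List;
import java.util.Map;

import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class DirectBoltA extends BaseBasicBolt {

    // Task ids of the consuming component, looked up once per worker.
    private transient List<Integer> consumerTasks;

    @Override
    @SuppressWarnings("rawtypes")
    public void prepare(Map conf, TopologyContext context) {
        // "boltB" is a hypothetical component id; it must match the id
        // passed to setBolt for the consumer.
        consumerTasks = context.getComponentTasks("boltB");
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // The second argument marks the stream as a direct stream.
        declarer.declareStream("directStream", true, new Fields("somevalue1"));
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String somevalue1 = input.getString(0);
        // Illustrative choice only: pick the receiving task from the value's hash.
        int index = Math.floorMod(somevalue1.hashCode(), consumerTasks.size());
        collector.emitDirect(consumerTasks.get(index), "directStream", new Values(somevalue1));
    }
}
```

The consumer would then subscribe with something like `builder.setBolt("boltB", boltB, 4).directGrouping("boltA", "directStream")`, assuming this bolt is registered as `boltA`. Note that direct grouping selects a task within a single consuming bolt. Sending different values to different bolts still relies on separate streams.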