I am new to Storm and am trying to understand the flow of execution of the different methods from spout to bolt.

A spout has methods like:
nextTuple()
open()
declareOutputFields()
activate()
deactivate()

and a bolt has methods like:
prepare()
execute()
cleanup()
declareOutputFields()

Can anyone tell me the sequence of execution of these methods?
First you create a LocalDRPC object. This object simulates a DRPC server in process, just as LocalCluster simulates a Storm cluster in process. Then you create the LocalCluster to run the topology in local mode. LinearDRPCTopologyBuilder has separate methods for creating local topologies and remote topologies.
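If it helps, here is a minimal sketch of that local-mode DRPC setup, based on the classic "exclamation" example from the Storm DRPC docs. The function name, class names, and parallelism are illustrative, and note that LinearDRPCTopologyBuilder is deprecated in newer Storm releases:

```java
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.LocalDRPC;
import org.apache.storm.drpc.LinearDRPCTopologyBuilder;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class LocalDrpcDemo {

    // From the classic Storm DRPC example: appends "!" to each request.
    public static class ExclaimBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
            // Field 0 is the DRPC request id, field 1 is the request argument.
            String input = tuple.getString(1);
            collector.emit(new Values(tuple.getValue(0), input + "!"));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("id", "result"));
        }
    }

    public static void main(String[] args) throws Exception {
        LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("exclamation");
        builder.addBolt(new ExclaimBolt(), 3);

        LocalDRPC drpc = new LocalDRPC();          // simulates a DRPC server in process
        LocalCluster cluster = new LocalCluster(); // simulates a Storm cluster in process

        // createLocalTopology wires the topology to the in-process DRPC server;
        // createRemoteTopology would be used when submitting to a real cluster.
        cluster.submitTopology("drpc-demo", new Config(), builder.createLocalTopology(drpc));

        System.out.println("hello -> " + drpc.execute("exclamation", "hello"));

        cluster.shutdown();
        drpc.shutdown();
    }
}
```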
Apache Storm handles real-time data just as Hadoop handles batch processing. (Batch processing is the opposite of real-time: data is collected into batches and each batch is processed as a whole, rather than being processed as it arrives.)
Apache Storm: Apache Storm is a real-time message processing system in which you can edit or manipulate data in real time. For example, Storm can pull data from Kafka and apply the required manipulation. It makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing.
The main method in a bolt is the execute method, which takes a new tuple as input. A bolt emits new tuples using the OutputCollector object.
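For illustration, a minimal bolt might look like the following sketch, assuming the org.apache.storm API (the class name and the "word" field are hypothetical):

```java
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

import java.util.Map;

// A minimal bolt: receives a tuple, emits an upper-cased copy, then acks it.
public class UppercaseBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map<String, Object> conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;  // keep the collector for use in execute()
    }

    @Override
    public void execute(Tuple input) {
        String word = input.getStringByField("word");
        collector.emit(input, new Values(word.toUpperCase()));  // anchor to the input tuple
        collector.ack(input);                                   // acknowledge successful processing
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}
```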
First, when your topology is started...
declareOutputFields is called.

Second, in each worker somewhere on the cluster...
open (spouts) and prepare (bolts) happen once; then ack, fail, and nextTuple (spouts) and execute (bolts) are called repeatedly.

If your topology is deactivated...
the deactivate method will be called. When you activate the topology again, activate will be called.

If your topology is killed...
close (spouts) and cleanup (bolts) are called.

Note: there is no guarantee that close will be called, because the supervisor kill -9's worker processes on the cluster. (source)
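To observe this ordering yourself, one option is a spout that logs every lifecycle callback, sketched below against the Storm 2.x org.apache.storm API (on 1.x, open takes a raw Map); the class name is hypothetical:

```java
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

import java.util.Map;

// A spout that prints each lifecycle callback so the ordering can be observed.
public class LifecycleLoggingSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    private int n = 0;

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        System.out.println("declareOutputFields (at topology submission)");
        declarer.declare(new Fields("n"));
    }

    @Override
    public void open(Map<String, Object> conf, TopologyContext context, SpoutOutputCollector collector) {
        System.out.println("open (once per executor, inside the worker)");
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        Utils.sleep(1000);                 // throttle the demo output
        collector.emit(new Values(n), n);  // a message id enables the ack/fail callbacks
        n++;
    }

    @Override
    public void ack(Object msgId)  { System.out.println("ack " + msgId); }

    @Override
    public void fail(Object msgId) { System.out.println("fail " + msgId); }

    @Override
    public void activate()   { System.out.println("activate"); }

    @Override
    public void deactivate() { System.out.println("deactivate"); }

    @Override
    public void close()      { System.out.println("close (not guaranteed to run)"); }
}
```

Submitting this spout in a LocalCluster and then deactivating, reactivating, and finally killing the topology prints the callbacks in the order described above (close may never print, as noted).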