I want to fire multiple web requests in parallel and then aggregate the data in a storm topology? which of the following way is preferred
1) create multiple threads within a bolt
2) Create multiple bolts and create a merging bolt to aggregate the data.
I would like to create multiple threads within a bolt because merging data in another bolt is not a simple process. But i see there are some concerns around that I found on internet https://mail-archives.apache.org/mod_mbox/storm-user/201311.mbox/%3CCAAYLz+pUZ44GNsNNJ9O5hjTr2rZLW=CKM=FGvcfwBnw613r1qQ@mail.gmail.com%3E but didn't get clear reason why not to create multiple threads. Any pointers will help.
On a side note does that mean i should not use java8's capabilities of parallel streams as well as mentioned in https://docs.oracle.com/javase/tutorial/collections/streams/parallelism.html?
Increase number of tasks for the bolt, its like spawning multiple instances of the same. And also increase the number of executors (threads) to handle them evenly.
Make sure #executors <= #tasks. Storm will do the rest for you.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With