
Storm vs. Trident: When not to use Trident?

I'm working with Storm and it is fine for a lot of use cases. Recently I had a look at Trident, which is a high-level abstraction of Storm. It supports exactly-once processing and makes stateful processing easier.

But now I'm wondering: why can't I always use Trident instead of Storm?

What I read so far:

  • Trident processes messages in batches, so the latency of an individual message could be higher.
  • Trident is not yet able to process loops in topologies.
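
To make the batching point concrete, here is a toy model in plain Python (not Storm or Trident code). The function name `batch_latencies`, the release rule and the arrival times are all assumptions made up for illustration; real Trident batching is driven by the spout and the batch coordination, not by this simple rule.

```python
def batch_latencies(arrival_times, batch_size):
    """Return how long each tuple waits before its batch is released.

    Toy model: a batch is released the moment its last tuple arrives,
    and a trailing partial batch is released when the input ends.
    """
    latencies = []
    for start in range(0, len(arrival_times), batch_size):
        batch = arrival_times[start:start + batch_size]
        release_time = batch[-1]  # the batch leaves when its last tuple arrives
        latencies.extend(release_time - t for t in batch)
    return latencies


arrivals = [0, 10, 20, 30, 40, 50]  # hypothetical arrival times in ms
per_tuple = batch_latencies(arrivals, batch_size=1)  # tuple-at-a-time processing
batched = batch_latencies(arrivals, batch_size=3)    # batches of three tuples
```

In this model the first tuple of every batch waits for the rest of its batch, which is the extra delay the bullet above refers to.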

Are there any other disadvantages when using Trident instead of Storm? Because right now, I think the disadvantages I listed above are marginal.

What use cases cannot be implemented with Trident?


Aftermath:

Since I asked the question, my company decided to go for Trident first. We will only use pure Storm when there are performance problems. Sadly, this wasn't an active decision; it just became the default behavior (I wasn't around at that time).

Their assumption was that in most use cases we need state or exactly-once processing, or will need it in the near future. I understand their reasoning, because moving from Storm to Trident or back isn't an easy transformation. In my personal opinion, though, the concept of stream processing without state wasn't understood by everyone, and that was the main reason to use Trident.

Christian Strempfer asked Mar 20 '13 10:03


4 Answers

To answer your question: when shouldn't you use Trident? Whenever you can afford not to.

Trident adds complexity to a Storm topology, lowers performance and generates state. Ask yourself the question: do you need the "exactly once" processing semantics of Trident, or can you live with the "at least once" processing semantics of Storm? For exactly once, use Trident; otherwise don't.

I would also just like to highlight the fact that Storm guarantees that all messages will be processed. Some messages might just be processed more than once.
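
To illustrate the difference, here is a minimal sketch in plain Python (not Storm or Trident code). The class names, the `apply` method and the delivery log are hypothetical. It assumes that batches arrive in order and that a replayed batch carries the same transaction id and the same contents as the original, which is the kind of guarantee a transactional setup relies on.

```python
class PlainCounter:
    """Counts every delivery, so a replayed batch is counted twice."""

    def __init__(self):
        self.count = 0

    def apply(self, txid, batch):
        self.count += len(batch)


class TxidCounter:
    """Stores the last applied transaction id next to the count and
    skips a batch whose id was already applied."""

    def __init__(self):
        self.count = 0
        self.last_txid = None

    def apply(self, txid, batch):
        if txid == self.last_txid:
            return  # replay of an already applied batch, ignore it
        self.count += len(batch)
        self.last_txid = txid


# Hypothetical delivery log: batch 2 is delivered twice after a failure.
deliveries = [(1, ["a", "b"]), (2, ["c"]), (2, ["c"]), (3, ["d", "e"])]

plain, transactional = PlainCounter(), TxidCounter()
for txid, batch in deliveries:
    plain.apply(txid, batch)
    transactional.apply(txid, batch)
```

With this log, `plain` ends up counting the replayed tuple twice while `transactional` counts each tuple once. The price is the extra bookkeeping (the stored id and the ordering requirement), which is part of the complexity and overhead mentioned above.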

John Gilmore answered Nov 13 '22 19:11


If the lowest possible latency is your goal and you don't need exactly-once processing, then using Storm is better than Trident.

ChrisBlom answered Nov 13 '22 18:11


Trident is a high-level abstraction for doing realtime computing on top of Twitter Storm, available in Storm 0.8.x. Storm is a stateless stream processing framework, and Trident provides stateful stream processing.

Do Do answered Nov 13 '22 19:11


Chris, both of these are open-source technologies. Trident is only one implementation of a particular scenario on top of Storm, and of course this brings a performance overhead. If Trident cannot meet your requirements, you can create your own state implementation on top of Storm. Over time, Trident has also given rise to higher-level projects such as Trident-ML.

HakkiBuyukcengiz answered Nov 13 '22 19:11