
Apache Ignite vs. Apache Storm (in-depth)

Apache Ignite and Apache Storm differ in many respects, especially since Storm has one very specific use case, while Ignite bundles quite a large set of tools under one roof. As I understand it, the core of Ignite is its in-memory storage. Built on that is its data-locality-aware computation, and built on that are all kinds of cool "toys". The one I am interested in is the streaming functionality, which is basically a query listener on the changing in-memory cache.

If I set the sliding window to one tuple, Ignite provides, like Storm, one-tuple-at-a-time processing. Ignite stores the data in memory. Storm does not "store" the data in the in-memory-storage sense, but its tuples are of course also held in memory while they are processed. So in both cases I have streaming, I have data in memory, and I am able to distribute my computation.
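To make the "window of one tuple" point concrete, here is a minimal sketch in plain Python. It uses neither the Ignite nor the Storm API; `windowed` is a hypothetical helper that only models a count-based sliding window, under the assumption that the window is evaluated after every arrival.

```python
from collections import deque


def windowed(stream, size):
    """Yield a snapshot of the last `size` tuples after each arrival."""
    window = deque(maxlen=size)  # the oldest entry is evicted automatically
    for item in stream:
        window.append(item)
        yield tuple(window)


events = ["a", "b", "c"]

# A window of two keeps some history between arrivals.
print(list(windowed(events, 2)))  # [('a',), ('a', 'b'), ('b', 'c')]

# A window of one degenerates to one-tuple-at-a-time processing.
print(list(windowed(events, 1)))  # [('a',), ('b',), ('c',)]
```

With `size=1` every snapshot contains exactly the tuple that just arrived, which is the per-tuple behaviour described above.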

I get the sense that programs with many data transformation steps might be easier to write in Storm, due to the abstractions of both technologies. What can be said about that?

Second question: What about the performance? I'd guess Ignite's data locality might give it an advantage. On the other hand I think multiple steps might be better distributed in Storm (different bolts on all kinds of machines), while an Ignite program might not be split so easily.

If I still wanted to distribute the stream (not just by data, but also with the steps on different machines), I guess I would have to write multiple Ignite streamers that communicate through caches, right? That sounds more difficult to write than in Storm (bringing us back to the first question).

Make42 asked Nov 26 '15 10:11



1 Answer

> I get the sense that programs with many data transformation steps might be easier to write in Storm, due to the abstractions of both technologies. What can be said about that?

You are probably right about that. Multiple transformations do seem easier in Storm, although Ignite also supports them reasonably well: each step streams its newly produced tuples into another cache.
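The idea of chaining steps through intermediate caches can be sketched in a few lines of plain Python. This is only a toy model under stated assumptions: `Cache` and `connect` are hypothetical names, not Ignite classes, and everything runs in one process, so it shows the shape of the pipeline rather than how Ignite distributes it.

```python
class Cache:
    """Hypothetical stand-in for an in-memory cache with update listeners."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.listeners = []

    def on_put(self, listener):
        self.listeners.append(listener)

    def put(self, key, value):
        self.data[key] = value
        # Notify every listener about the new entry.
        for listener in self.listeners:
            listener(key, value)


def connect(source, target, transform):
    """Stream every update of `source` through `transform` into `target`."""
    source.on_put(lambda key, value: target.put(key, transform(value)))


raw = Cache("raw")
cleaned = Cache("cleaned")
lengths = Cache("lengths")

# Two transformation steps, each reading one cache and writing the next.
connect(raw, cleaned, str.strip)
connect(cleaned, lengths, len)

raw.put(1, "  storm ")
raw.put(2, " ignite  ")
print(lengths.data)  # {1: 5, 2: 6}
```

Each `connect` call plays the role that a bolt plays in a Storm topology; the difference is that the intermediate results stay queryable in their own cache.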

> What about the performance? I'd guess Ignite's data locality might give it an advantage. On the other hand I think multiple steps might be better distributed in Storm (different bolts on all kinds of machines), while an Ignite program might not be split so easily.

From what I hear within the community, Ignite should be an order of magnitude faster than Storm.

> If I still wanted to distribute the stream (not just by data, but also with the steps on different machines), I guess I would have to write multiple Ignite streamers that communicate through caches, right?

Yes, you are right. Having multiple caches in Ignite is not a bad thing, and is actually recommended. Most users end up having a dozen or two.

> That sounds more difficult to write than in Storm (bringing us back to the first question).

It sounds like you need to decide how important performance is to you.

Dmitriy answered Sep 25 '22 04:09
