Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to write c++ spout/bolt on Storm and Thrift usage in Storm

From here: Storm was designed from the very beginning to be compatible with multiple languages. Nimbus is a Thrift service and topologies are defined as Thrift structures. The usage of Thrift allows Storm to be used from any language.

I see that a topology created in java get deployed by serializing the topology (spouts, bolts , ComponentCommon) as a Thrift datatypes and then gets deployed on Nimbus. In Java it is easy to serialize the object with its methods and data. So on the other side Nimbus just needs to create objects and invoke them. (i might be missing detail here but I hope I got the point correctly)

But I wonder how to write the topology in C++ and deploy it the same way. Does thrift help to serialize the c++ based topology and Nimbus deploys/executes the topology in the same way as for Java?

I have seen the links link1 link2 in this regard and the only solution seems to be using a Shelbolt. which invokes the process and communicates with it over standard i/o.

In order to use the Thrift way, do we need to rewrite the storm core also in C++? Also why use Thrift when it supports only JVM languages? Thrift doesn't seem to be used at all for languages like python/c++.

like image 249
weima Avatar asked Oct 01 '13 05:10

weima


1 Answers

I'm not sure if I'm understand your question correctly -- in my understanding you're asking Is it possible [without the Shebolt hack] to use Storm [with Thrift as comm protocol] with C++-written bolts and with C++ as the language that creates the topology.

Because of the lack of other answers to this question and based on my own research I assume there is no finished, usable implementation for your problem.

Therefore if you really have to use Storm (its common usecase is the JVM, so even if it could theoretically work with any language, it doesn't mean there is an ecosystem for other languages) and C++, you have no option but to use the Shebolt hack or modify Thrift yourself.

As you know, thrift itself has also been ported to C++. Therefore it is possible to re-build the API calls in C++. Basically, you'd need to port the Java TopologyBuilder. On the C++ side, you could start with the Thrift C++ tutorial.

This is also some kind of a hack, as you basically just rebuild half of the stack (in this case ontop of Thrift), but in general you have very few other options with a system design like Storm. For example, the MySQL binary protocol has been rebuilt from-scr

Unless anyone has done the work for you (which I would have completely missed in my research) I see no option than to do it yourself (maybe even storm is not the best tool for your usecase!?)

If another hack (which might be even more complex and maybe even slower) besides ShellBolt is good enough for you, you could try starting a JVM from inside C++, e.g. see this SO post. I would not recommend this.

If you need an alternative distributed task queue, I have had good experience with Celery in Python environments, however I have no experience in using it in C++ directly (I usually control Python with ZeroMQ, or write my own ZeroMQ-based queues where necessary, but this is no universal solution).

like image 187
Uli Köhler Avatar answered Oct 20 '22 01:10

Uli Köhler