
Does Storm replay a tuple whose processing has timed out?

The Storm documentation mentions that Storm replays a tuple whose processing has timed out. My question is whether Storm does this automatically (without calling fail() on the origin spout), or whether it is the responsibility of the origin spout to replay the tuple (fail() is called, and the replay has to be implemented inside it or even somewhere external)?

oo_olo_oo asked May 09 '13 08:05



2 Answers

In order to have a proper replay on a timeout, you must attach a unique message ID to the tuple when you emit it from the spout. When the timeout occurs, Storm passes that same ID back to the spout's fail method (fail(Object msgId)). You can then use the ID of the failed or timed-out tuple to replay it, or to do anything else you want with it. Each message ID must be unique. A database ID is a typical example: when your tuple fails, you can use the database ID to recreate the tuple and re-emit it.

So to answer your question: the replay logic must live inside fail(), and you use the message ID to recreate your tuple. Hope this info helps.
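The bookkeeping described above can be sketched roughly as follows. This is a minimal, Storm-free illustration and not a drop-in spout: Emitter is a hypothetical stand-in for Storm's SpoutOutputCollector (whose emit method accepts a tuple plus a message ID), the class name ReplayingSpoutSketch is made up, and the in-memory map stands in for the database mentioned in the answer. In a real topology the same three methods would live in a class extending BaseRichSpout.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for Storm's SpoutOutputCollector, so the sketch runs without Storm.
interface Emitter {
    void emit(List<Object> tuple, Object msgId);
}

// Spout-side replay bookkeeping (assumption: rows are identified by a unique Long ID).
class ReplayingSpoutSketch {
    private final Map<Long, String> rows;                  // stands in for a database: id -> row
    private final Emitter emitter;
    private final Queue<Long> toEmit = new ArrayDeque<>(); // IDs waiting to be (re-)emitted

    ReplayingSpoutSketch(Map<Long, String> rows, Emitter emitter) {
        this.rows = rows;
        this.emitter = emitter;
        toEmit.addAll(rows.keySet());
    }

    // Emit the next waiting row, tagged with its unique ID as the message ID.
    void nextTuple() {
        Long id = toEmit.poll();
        if (id == null) {
            return; // nothing to emit right now
        }
        emitter.emit(List.<Object>of(rows.get(id)), id);
    }

    // The tuple was fully processed; this sketch has nothing to clean up.
    void ack(Object msgId) {
    }

    // Called on an explicit failure or a timeout: queue the ID again so that
    // the next call to nextTuple() rebuilds the tuple from the "database" and re-emits it.
    void fail(Object msgId) {
        toEmit.add((Long) msgId);
    }

    public static void main(String[] args) {
        Map<Long, String> rows = new LinkedHashMap<>();
        rows.put(42L, "some row");
        ReplayingSpoutSketch spout = new ReplayingSpoutSketch(
                rows, (tuple, msgId) -> System.out.println("emit " + tuple + " id=" + msgId));
        spout.nextTuple(); // first emit
        spout.fail(42L);   // simulate the timeout callback
        spout.nextTuple(); // replay of the same row
    }
}
```

Re-queuing only the ID, rather than the tuple itself, mirrors the database-ID suggestion: the spout keeps little state and rebuilds the tuple when it is replayed. A production spout would probably also cap the number of retries.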

Naresh answered Sep 17 '22 22:09



From http://storm.apache.org/documentation/Guaranteeing-message-processing.html,

if the tuple times-out Storm will call the fail method on the Spout

So yes, fail will be called when a tuple times out; replaying the tuple from there is up to the spout.

G Gordon Worley III answered Sep 21 '22 22:09
