Multiple Short-lived TPL Dataflows versus Single Long-Running Flow

I'm using TPL Dataflow to process items off a queue in an Azure worker role. Should I have a single long-running dataflow, or spawn a new flow for every message I receive?

If an exception is thrown inside a block, that block faults and stops accepting new messages. That means a single unhandled exception can stop the whole dataflow from processing.
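For context, here is a minimal sketch of the faulting behavior described above (the block body is illustrative):

```csharp
using System;
using System.Threading.Tasks.Dataflow;

var block = new ActionBlock<string>(msg =>
{
    // An unhandled exception here moves the block into the faulted state.
    if (msg == null) throw new ArgumentNullException(nameof(msg));
    Console.WriteLine(msg);
});

block.Post("ok");      // accepted and processed
block.Post(null);      // delegate throws -> block faults
block.Post("dropped"); // Post returns false once the fault has propagated
```

Once faulted, the block's `Completion` task completes in the faulted state and all further posts are declined.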

I need to be able to withstand exceptions from things like invalid queue input without locking up my dataflow. I see two options:

  1. Start a single dataflow and send messages to it as they come off the queue. The body of each block is wrapped in a try-catch that logs the exception and then continues processing. This seems clumsy and I assume there's a better way.
  2. For each message, start a new dataflow and process that single queue message. If an exception is thrown in any block, the dataflow completes and I only lose a single message. Most Dataflow examples I've seen send many messages through one flow, so this doesn't feel right either.
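The second option might be sketched like this (`ParseOrder`, `SaveOrder`, and `Log` are hypothetical helpers standing in for the real queue-message processing):

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Sketch of option 2: build a fresh, short-lived pipeline per queue message.
// If any block throws, only this message's flow faults.
async Task ProcessOneAsync(string msg)
{
    var parse = new TransformBlock<string, Order>(m => ParseOrder(m));
    var save  = new ActionBlock<Order>(o => SaveOrder(o));
    parse.LinkTo(save, new DataflowLinkOptions { PropagateCompletion = true });

    parse.Post(msg);
    parse.Complete();

    try
    {
        await save.Completion; // faults if either block threw
    }
    catch (Exception ex)
    {
        Log(ex); // only this one message is lost
    }
}
```

Note the overhead: every message pays for constructing and completing a full pipeline.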

I've seen lots of documentation on how to complete a dataflow after an exception, but very little on how to recover from exceptions.

asked Nov 01 '22 by pnewhook

1 Answer

You should definitely go with the first option and have only one flow.

With the second option there's no added value in using a dataflow over simply calling several methods one after the other. There is also the overhead of constructing a full pipeline for each and every item.

It's better to build the flow once and use it throughout the app's lifetime. I don't think there's anything wrong with handling exceptions per block, but if you want, you can let the whole flow fault and only then create a new one.
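Handling exceptions per block might look like the following sketch (`ParseOrder` and `Log` are hypothetical; the null-filtering link is one common way to discard failed items):

```csharp
using System;
using System.Threading.Tasks.Dataflow;

// Sketch of option 1: a long-lived block whose body catches, logs,
// and returns null for bad input instead of faulting the flow.
var parse = new TransformBlock<string, Order>(msg =>
{
    try
    {
        return ParseOrder(msg); // hypothetical parser
    }
    catch (Exception ex)
    {
        Log(ex);                // hypothetical logger
        return null;            // marks the message as failed
    }
});

var save = new ActionBlock<Order>(o => SaveOrder(o));

// Pass good results downstream; drop nulls into the "null target".
parse.LinkTo(save, new DataflowLinkOptions { PropagateCompletion = true },
             o => o != null);
parse.LinkTo(DataflowBlock.NullTarget<Order>());
```

The flow keeps accepting messages after a bad one, because the exception never escapes the block's delegate.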

answered Nov 15 '22 by i3arnon