Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

TPL Dataflow: design for parallelism while keeping order

I have never worked with TPL before so I was wondering whether this can be done with it: My application creates a gif image animation file from a lot of frames. I start with a list of Bitmap which represents the frames of the gif file and need to do the following for each frame:

  1. paint a number of text/bitmaps onto the frame
  2. crop the frame
  3. resize the frame
  4. reduce the image to 256 colors

Obviously this process can be done in parallel for all the frames in the list but for each frame the order of steps needs to be the same. After that, I need to write all the frames to the gif file. Therefore all the frames need to be received in the same order they were in in the original list. On top of that, this process can start when the first frame is ready for it, there is no need to wait until all frames are processed.

So that's the situation. Is TPL Dataflow suitable for this? If yes, can anyone give me a hint in the right direction on how to design the tpl block structure to reflect the process explained above? It seems quite complex to me compared to some samples I've found.

like image 573
p4gefault Avatar asked Feb 03 '14 00:02

p4gefault


3 Answers

I think it makes sense to use TPL Dataflow for this, especially since it automatically keeps the processed elements in the right order, even with parallelism turned on.

You could create a separate block for each step in the process, but I think there is no need for that here, one block for processing the frames and one for writing them will be enough:

public Task CreateAnimationFileAsync(IEnumerable<Bitmap> frames)
{
    var frameProcessor = new TransformBlock<Bitmap, Bitmap>(
        frame => ProcessFrame(frame),
        new ExecutionDataflowBlockOptions
        { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded });

    var animationWriter = new ActionBlock<Bitmap>(frame => WriteFrame(frame));

    frameProcessor.LinkTo(
        animationWriter,
        new DataflowLinkOptions { PropagateCompletion = true });

    foreach (var frame in frames)
    {
        frameProcessor.Post(frame);
    }

    frameProcessor.Complete();

    return animationWriter.Completion;
}

private Bitmap ProcessFrame(Bitmap frame)
{
    …
}

private async Task WriteFrame(Bitmap frame)
{
    …
}
like image 167
svick Avatar answered Sep 25 '22 10:09

svick


Your problem is a perfect example of where dataflow excels.

Here is the simplest code that can get you started.

// Try increasing MaxDegreeOfParallelism
var opt = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 };

// Create the blocks
// You must define the functions to do what you want
var paintBlock = new TransformBlock<Bitmap, Bitmap>(fnPaintText, opt);
var cropBlock = new TransformBlock<Bitmap, Bitmap>(fnCrop, opt);
var resizeBlock = new TransformBlock<Bitmap, Bitmap>(fnResize, opt);
var reduceBlock = new TransformBlock<Bitmap, Bitmap>(fnReduce,opt);

// Link the blocks together
paintBlock.LinkTo(cropBlock);
cropBlock.LinkTo(resizeBlock);
resizeBlock.LinkTo(reduceBlock);

// Send data to the first block
// ListOfImages contains your original frames
foreach (var img in ListOfImages) { 
   paintBlock.Post(img);
}

// Receive the modified images
var outputImages = new List<Bitmap>();
for (int i = 0; i < ListOfImages.Count; i++) {
   outputImages.Add(reduceBlock.Receive());
}

// outputImages now holds all of the frames
// reassemble them in order
like image 45
Matt Carkci Avatar answered Sep 25 '22 10:09

Matt Carkci


I think you will find that DataFlow is the right way to go. For every frame, from your frame list, try to create one TransformBlock. For each of the four steps, chain together the frames in the correct order. If you want to process the framelist concurrently, you might use a bufferblock for the framelist.

please find the full sample on how to use transformblock on msdn:

like image 27
devhorse Avatar answered Sep 23 '22 10:09

devhorse