Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using Task.Yield to overcome ThreadPool starvation while implementing producer/consumer pattern

Answering the question: Task.Yield - real usages? I proposed to use Task.Yield allowing a pool thread to be reused by other tasks. In such pattern:

  CancellationTokenSource cts;
  void Start()
  {
        cts = new CancellationTokenSource();

        // run async operation
        var task = Task.Run(() => SomeWork(cts.Token), cts.Token);
        // wait for completion
        // after the completion handle the result/ cancellation/ errors
    }

    async Task<int> SomeWork(CancellationToken cancellationToken)
    {
        int result = 0;

        bool loopAgain = true;
        while (loopAgain)
        {
            // do something ... means a substantial work or a micro batch here - not processing a single byte

            loopAgain = /* check for loop end && */  cancellationToken.IsCancellationRequested;
            if (loopAgain) {
                // reschedule  the task to the threadpool and free this thread for other waiting tasks
                await Task.Yield();
            }
        }
        cancellationToken.ThrowIfCancellationRequested();
        return result;
    }

    void Cancel()
    {
        // request cancelation
        cts.Cancel();
    }

But one user wrote

I don't think using Task.Yield to overcome ThreadPool starvation while implementing producer/consumer pattern is a good idea. I suggest you ask a separate question if you want to go into details as to why.

Anybody knows, why is not a good idea?

like image 503
Maxim T Avatar asked Oct 16 '22 10:10

Maxim T


1 Answers

There are some good points left in the comments to your question. Being the user you quoted, I'd just like to sum it up: use the right tool for the job.

Using ThreadPool doesn't feel like the right tool for executing multiple continuous CPU-bound tasks, even if you try to organize some cooperative execution by turning them into state machines which yield CPU time to each other with await Task.Yield(). Thread switching is rather expensive; by doing await Task.Yield() on a tight loop you add a significant overhead. Besides, you should never take over the whole ThreadPool, as the .NET framework (and the underlying OS process) may need it for other things. On a related note, TPL even has the TaskCreationOptions.LongRunning option that requests to not run the task on a ThreadPool thread (rather, it creates a normal thread with new Thread() behind the scene).

That said, using a custom TaskScheduler with limited parallelism on some dedicated, out-of-pool threads with thread affinity for individual long-running tasks might be a different thing. At least, await continuations would be posted on the same thread, which should help reducing the switching overhead. This reminds me of a different problem I was trying to solve a while ago with ThreadAffinityTaskScheduler.

Still, depending on a particular scenario, it's usually better to use an existing well-established and tested tool. To name a few: Parallel Class, TPL Dataflow, System.Threading.Channels, Reactive Extensions.

There is also a whole range of existing industrial-strength solutions to deal with Publish-Subscribe pattern (RabbitMQ, PubNub, Redis, Azure Service Bus, Firebase Cloud Messaging (FCM), Amazon Simple Queue Service (SQS) etc).

like image 160
noseratio Avatar answered Dec 11 '22 09:12

noseratio